Fine-Tuning Textrank for Legal Document Summarization: A Bayesian Optimization Based Approach

Author(s):  
Deepali Jain ◽  
Malaya Dutta Borah ◽  
Anupam Biswas
2021 ◽  
Vol 3 (1) ◽  
pp. 3
Author(s):  
Roland Preuss ◽  
Udo von Toussaint

A Gaussian-process surrogate model based on already acquired data is employed to approximate an unknown target surface. In order to optimally locate the next function evaluations in parameter space, a whole variety of utility functions is at one's disposal. However, a good choice of a specific utility function, or of a certain combination of them, offers the fastest way to determine the best surrogate surface or its extremum with the lowest possible amount of additional data. In this paper, we propose to consider the global (integrated) variance as a utility function, i.e., to integrate the variance of the surrogate over a finite volume in parameter space. It turns out that this utility not only complements the tool set for fine-tuning investigations in a region of interest but also expedites the optimization procedure as a whole.
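
As a rough illustration of the integrated-variance utility described above, the following Python sketch scores candidate evaluation points by the Monte-Carlo-averaged posterior variance of a scikit-learn Gaussian-process surrogate over the unit square; the toy target, kernel settings, and sample sizes are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def target(x):                        # unknown black-box surface (toy example)
    return np.sin(3 * x[:, 0]) * np.cos(2 * x[:, 1])

X = rng.uniform(0, 1, size=(5, 2))    # already acquired data
y = target(X)

# Monte-Carlo nodes approximating the integral of the posterior variance
# over the finite parameter volume [0, 1]^2.
nodes = rng.uniform(0, 1, size=(512, 2))

def integrated_variance(X_design, y_design):
    """Mean posterior variance of the GP surrogate over the volume."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-6)
    gp.fit(X_design, y_design)
    _, std = gp.predict(nodes, return_std=True)
    return np.mean(std ** 2)

# Look-ahead: the GP posterior variance depends only on the input locations,
# so a dummy response of zeros is enough to score a candidate design.
candidates = rng.uniform(0, 1, size=(200, 2))
scores = [
    integrated_variance(np.vstack([X, c[None, :]]), np.zeros(len(X) + 1))
    for c in candidates
]
x_next = candidates[int(np.argmin(scores))]
print("current integrated variance:", integrated_variance(X, y))
print("next evaluation point:", x_next)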


2021 ◽  
pp. 016555152199061
Author(s):  
Salima Lamsiyah ◽  
Abdelkader El Mahdaouy ◽  
Saïd El Alaoui Ouatik ◽  
Bernard Espinasse

Text representation is a fundamental cornerstone that impacts the effectiveness of several text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider the order and the semantic relationships between words in a sentence, and thus they do not carry the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from the BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark datasets using single-task and multi-task fine-tuning methods. Experiments are performed on the standard DUC’2002–2004 datasets. The obtained results show that our method significantly outperforms several baseline methods and achieves comparable, and sometimes better, performance than recent state-of-the-art deep learning-based methods. Furthermore, the results show that fine-tuning BERT using multi-task learning considerably improves the performance.
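
The following sketch illustrates one simple way to build such an extractive summarizer from sentence embeddings: sentences are embedded, ranked by cosine similarity to the centroid of the document set, and selected with a crude redundancy filter. The sentence-transformers encoder name and the centroid-based ranking are assumptions for illustration; the paper additionally fine-tunes BERT on GLUE intermediate tasks before embedding.

import numpy as np
from sentence_transformers import SentenceTransformer

def summarize(sentences, k=3):
    model = SentenceTransformer("all-MiniLM-L6-v2")    # assumed encoder
    emb = model.encode(sentences, normalize_embeddings=True)
    centroid = emb.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    scores = emb @ centroid                            # cosine similarity to centroid
    chosen = []
    for idx in np.argsort(-scores):
        # simple redundancy filter: skip near-duplicates of already chosen sentences
        if all(float(emb[idx] @ emb[j]) < 0.8 for j in chosen):
            chosen.append(int(idx))
        if len(chosen) == k:
            break
    return [sentences[i] for i in sorted(chosen)]      # keep original order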


2021 ◽  
Vol 7 ◽  
pp. e444
Author(s):  
Jussi Kalliola ◽  
Jurgita Kapočiūtė-Dzikienė ◽  
Robertas Damaševičius

Accurate price evaluation of real estate is beneficial for many parties involved in the real estate business, such as real estate companies, property owners, investors, banks, and financial institutes. Artificial Neural Networks (ANNs) have shown promising results in real estate price evaluation. However, the performance of ANNs greatly depends upon the settings of their hyperparameters. In this paper, we apply and optimize an ANN model for real estate price prediction in Helsinki, Finland. Optimization of the model is performed by fine-tuning hyperparameters (such as activation functions, optimization algorithms, etc.) of the ANN architecture for higher accuracy using the Bayesian optimization algorithm. The results are evaluated using a variety of metrics (RMSE, MAE, R²) as well as illustrated graphically. The empirical analysis of the results shows that model optimization improved the performance on all metrics (reaching a relative mean error of 8.3%).
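
A minimal sketch of this setup, using scikit-optimize's gp_minimize to tune a small scikit-learn MLP regressor, is given below; the search space, library choice, and the California-housing stand-in dataset are assumptions for illustration, not the authors' pipeline.

import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from skopt import gp_minimize
from skopt.space import Categorical, Integer, Real
from skopt.utils import use_named_args

X, y = fetch_california_housing(return_X_y=True)   # stand-in for the Helsinki data
X, y = X[:2000], y[:2000]                           # subsample to keep the demo fast

space = [
    Integer(16, 128, name="units"),
    Real(1e-4, 1e-1, prior="log-uniform", name="lr"),
    Categorical(["relu", "tanh"], name="activation"),
]

@use_named_args(space)
def objective(units, lr, activation):
    model = MLPRegressor(hidden_layer_sizes=(units,), activation=activation,
                         learning_rate_init=lr, max_iter=300, random_state=0)
    rmse = -cross_val_score(model, X, y, cv=3,
                            scoring="neg_root_mean_squared_error").mean()
    return rmse                                      # Bayesian optimization minimizes this

result = gp_minimize(objective, space, n_calls=25, random_state=0)
print("best RMSE:", result.fun, "best hyperparameters:", result.x)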


2020 ◽  
Vol 109 (9-10) ◽  
pp. 1925-1943 ◽  
Author(s):  
Riccardo Moriconi ◽  
Marc Peter Deisenroth ◽  
K. S. Sesh Kumar

Bayesian optimization (BO) is a powerful approach for seeking the global optimum of expensive black-box functions and has proven successful for fine tuning hyper-parameters of machine learning models. However, BO is practically limited to optimizing 10–20 parameters. To scale BO to high dimensions, we usually make structural assumptions on the decomposition of the objective and/or exploit the intrinsic lower dimensionality of the problem, e.g., by using linear projections. We could achieve a higher compression rate with nonlinear projections, but learning these nonlinear embeddings typically requires much data. This contradicts the BO objective of a relatively small evaluation budget. To address this challenge, we propose to learn a low-dimensional feature space jointly with (a) the response surface and (b) a reconstruction mapping. Our approach allows for optimization of BO’s acquisition function in the lower-dimensional subspace, which significantly simplifies the optimization problem. We reconstruct the original parameter space from the lower-dimensional subspace for evaluating the black-box function. For meaningful exploration, we solve a constrained optimization problem.
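
The following sketch conveys the subspace idea with a fixed random linear reconstruction in place of the paper's jointly learned nonlinear embedding: the acquisition-driven search runs in a low-dimensional latent space, and each latent candidate is mapped back to the original parameter space (clipped to the box, as a crude stand-in for the constrained exploration step) before the black-box evaluation. All names and dimensions are illustrative assumptions.

import numpy as np
from skopt import gp_minimize
from skopt.space import Real

D, d = 50, 4                                    # ambient and latent dimensions
rng = np.random.default_rng(0)
A = rng.normal(size=(D, d)) / np.sqrt(d)        # reconstruction mapping z -> x

def black_box(x):                               # expensive function (toy stand-in)
    return float(np.sum((x - 0.1) ** 2))

def objective(z):
    x = np.clip(A @ np.asarray(z), -1.0, 1.0)   # reconstruct and keep within bounds
    return black_box(x)

space = [Real(-2.0, 2.0, name=f"z{i}") for i in range(d)]
result = gp_minimize(objective, space, n_calls=30, random_state=0)
print("best value found:", result.fun)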


2020 ◽  
Vol 6 ◽  
pp. e274
Author(s):  
Maxim Borisyak ◽  
Tatiana Gaintseva ◽  
Andrey Ustyuzhanin

Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, while the other is represented by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence between these distributions, and estimating the divergence involves a secondary optimization task, which typically requires training a model to discriminate between the distributions. The choice of the model has its trade-off: high-capacity models provide good estimations of the divergence but generally require large sample sizes to be properly trained, whereas low-capacity models tend to require fewer samples for training but might provide biased estimations. The computational cost of Adversarial Optimization becomes significant when sampling from the generator is expensive; one practical example of such a setting is fine-tuning the parameters of complex computer simulations. In this work, we introduce a novel family of divergences that enables faster optimization convergence, measured by the number of samples drawn from the generator. Varying the capacity of the underlying discriminator model during optimization leads to a significant speed-up. The proposed divergence family suggests using low-capacity models to compare distant distributions (typically at early optimization steps), with the capacity gradually growing as the distributions become closer to each other. Thus, it allows for a significant acceleration of the initial stages of optimization. This acceleration was demonstrated on two fine-tuning problems involving the Pythia event generator and two of the most popular black-box optimization algorithms: Bayesian Optimization and Variational Optimization. Experiments show that, given the same budget, adaptive divergences yield results up to an order of magnitude closer to the optimum than the Jensen-Shannon divergence. While we consider physics-related simulations, adaptive divergences can be applied to any stochastic simulation.
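
The sketch below illustrates only the capacity-scheduling intuition: a cheap low-capacity discriminator is used while the real and generated samples are easy to separate, and a higher-capacity model takes over once the distributions become close. The accuracy-based divergence proxy and the switching threshold are assumptions for illustration and are not the paper's proposed divergence family.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def divergence_estimate(real, generated):
    X = np.vstack([real, generated])
    y = np.concatenate([np.ones(len(real)), np.zeros(len(generated))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    # Cheap probe with a low-capacity discriminator first.
    probe = LogisticRegression(max_iter=500).fit(X_tr, y_tr)
    acc = probe.score(X_te, y_te)
    if acc <= 0.75:
        # Distributions are close: spend capacity on a finer discriminator.
        clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                            random_state=0).fit(X_tr, y_tr)
        acc = clf.score(X_te, y_te)
    return 2.0 * acc - 1.0   # ~0 when indistinguishable, ~1 when fully separable

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 2))
generated = rng.normal(1.5, 1.0, size=(500, 2))   # generator output (toy)
print("estimated divergence:", divergence_estimate(real, generated))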


2021 ◽  
Author(s):  
Arpan Mandal ◽  
Paheli Bhattacharya ◽  
Sekhar Mandal ◽  
Saptarshi Ghosh

Legal case summarization is an important problem, and several domain-specific summarization algorithms have been applied to this task. These algorithms generally use domain-specific legal dictionaries to estimate the importance of sentences. However, none of the popular summarization algorithms uses document-specific catchphrases, which provide a unique amalgamation of domain-specific and document-specific information. In this work, we assess the performance of two legal document summarization algorithms when two different types of catchphrases are incorporated into the summarization process. Our experiments confirm that both summarization algorithms improve across all performance metrics when document-specific catchphrases are incorporated.
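
As a rough illustration of how document-specific catchphrases can be folded into an extractive scorer, the sketch below boosts a TF-IDF-based sentence importance by each sentence's overlap with the catchphrases; the base scorer and the weighting factor are assumptions, not the specific algorithms evaluated in the paper.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def summarize_with_catchphrases(sentences, catchphrases, k=3, boost=0.5):
    vec = TfidfVectorizer(stop_words="english")
    tfidf = vec.fit_transform(sentences)
    base = np.asarray(tfidf.sum(axis=1)).ravel()          # base sentence importance
    phrases = [p.lower() for p in catchphrases]
    bonus = np.array([sum(p in s.lower() for p in phrases) for s in sentences])
    scores = base + boost * bonus                          # catchphrase-aware score
    top = sorted(np.argsort(-scores)[:k])                  # keep document order
    return [sentences[i] for i in top]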

