Publications

2018

YAKE! Collection-Independent Automatic Keyword Extractor

Authors
Campos, R; Mangaravite, V; Pasquali, A; Jorge, AM; Nunes, C; Jatowt, A;

Publication
ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018)

Abstract
In this paper, we present YAKE!, a novel feature-based system for multi-lingual keyword extraction from single documents, which supports texts of different sizes, domains or languages. Unlike most systems, YAKE! does not rely on dictionaries or thesauri, nor is it trained on any corpus. Instead, we follow an unsupervised approach that builds on features extracted from the text itself, making it applicable to documents written in many different languages without the need for external knowledge. This can be beneficial for a large number of tasks and situations where access to training corpora is either limited or restricted. In this demo, we offer an easy-to-use, interactive session where users from both academia and industry can try our system, either with a sample document or with their own text. As an add-on, we compare our extracted keywords against the output produced by the IBM Natural Language Understanding (IBM NLU) and RAKE systems. The YAKE! demo is available at http://bit.ly/YakeDemoECIR2018. A Python implementation of YAKE! is also available in the PyPI repository (https://pypi.python.org/pypi/yake/).

2018

Data Leakage in Java Applets with Exception Mechanism

Authors
Bernardeschi, C; Masci, P; Santone, A;

Publication
Proceedings of the Second Italian Conference on Cyber Security, Milan, Italy, February 6th to 9th, 2018.

Abstract
It is becoming increasingly important to study methods for protecting sensitive data in computer and communication systems from unauthorized access, use, modification, destruction or deletion. Sensitive data include intellectual property, payment information, personal files, personal credit card details and other information, depending on the business and the industry. Data leakage is therefore considered an emerging security threat to organizations and companies. In this paper we present a static analysis method for information flow analysis in Java bytecode with exceptions. Exceptions are special events that break the normal execution flow. They can be used as a device to leak high-security data, since exception throwing can be accurately driven. The proposed analysis is capable of tracing information flow caused by exceptions by identifying instructions protected by an exception handler as virtual control instructions. A malicious Java applet that clones the user's secret PIN through exceptions is shown.
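The leak described in this abstract can be illustrated with a minimal sketch. It is a Python analogy written for this listing, not the authors' Java applet or their analysis: the function names `leak_bit` and `clone_pin` and the 14-bit PIN width are assumptions made for illustration. No secret value is ever assigned to the public variable directly; the copy is rebuilt only from whether an exception was thrown.

```python
def leak_bit(secret_bit: int) -> int:
    """Copy one secret bit into a public value using only control flow.

    The public variable is never assigned from the secret directly; it is
    set inside the exception handler, which runs only if the secret bit is 1.
    """
    public = 0
    try:
        if secret_bit:
            # Whether this exception is thrown depends on the secret.
            raise ValueError("thrown only when the secret bit is 1")
    except ValueError:
        # Reaching the handler reveals that the secret bit was 1.
        public = 1
    return public


def clone_pin(pin: int, bits: int = 14) -> int:
    """Rebuild a PIN bit by bit from exception behaviour (hypothetical helper).

    14 bits are enough for any four-digit PIN, since 9999 < 2**14.
    """
    copy = 0
    for i in range(bits):
        copy |= leak_bit((pin >> i) & 1) << i
    return copy


if __name__ == "__main__":
    print(clone_pin(1234))
```

An analysis that tracks only direct assignments would miss this flow, which is why code protected by a handler has to be treated as depending on the condition that triggers the exception.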

2018

Simulation Beats Richness: New Data-Structure Lower Bounds

Authors
Chattopadhyay, A; Koucky, M; Loff, B; Mukhopadhyay, S;

Publication
STOC'18: PROCEEDINGS OF THE 50TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING

Abstract
We develop a technique for proving lower bounds in the setting of asymmetric communication, a model that was introduced in the famous works of Miltersen (STOC'94) and Miltersen, Nisan, Safra and Wigderson (STOC'95). At the core of our technique is a novel simulation theorem: Alice gets a $p \times n$ matrix $x$ over $\mathbb{F}_2$ and Bob gets a vector $y \in \mathbb{F}_2^n$. Alice and Bob need to evaluate $f(x \cdot y)$ for a Boolean function $f : \{0, 1\}^p \to \{0, 1\}$. Our simulation theorems show that a deterministic/randomized communication protocol exists for this problem, with cost $C \cdot n$ for Alice and $C$ for Bob, if and only if there exists a deterministic/randomized parity decision tree of cost $\Theta(C)$ for evaluating $f$. As applications of this technique, we obtain the following results: (i) The first strong lower bounds against randomized data-structure schemes for the Vector-Matrix-Vector product problem over $\mathbb{F}_2$. Moreover, our method yields strong lower bounds even when the data-structure scheme has a tiny advantage over random guessing. (ii) The first lower bounds against randomized data-structure schemes for two natural Boolean variants of Orthogonal Vector Counting. (iii) We construct an asymmetric communication problem and obtain a deterministic lower bound for it which is provably better than any lower bound that may be obtained by the classical Richness Method of Miltersen et al. This seems to be the first known limitation of the Richness Method in the context of proving deterministic lower bounds.
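The easy direction of this equivalence (a parity decision tree for f yields a protocol) can be sketched as follows. This is an illustration under stated assumptions, not code from the paper: the tree encoding, the names `simulate`, `TREE` and `f`, and the example function are all hypothetical. For each parity query on a set S of output coordinates, Alice XORs the rows of her matrix indexed by S and sends the resulting n bits, and Bob replies with a single bit, the inner product of that row with his vector over F_2. A tree of depth C therefore costs C times n bits for Alice and C bits for Bob.

```python
from itertools import product
from typing import List, Tuple, Union

# A leaf is an int (0 or 1). An internal node is a tuple
# (subset, child_if_0, child_if_1), where subset lists coordinates of z = x*y.
Tree = Union[int, tuple]


def dot_f2(u: List[int], v: List[int]) -> int:
    """Inner product of two bit vectors over F_2."""
    return sum(a & b for a, b in zip(u, v)) % 2


def matvec_f2(x: List[List[int]], y: List[int]) -> List[int]:
    """Matrix-vector product over F_2."""
    return [dot_f2(row, y) for row in x]


def simulate(tree: Tree, x: List[List[int]], y: List[int]) -> Tuple[int, int, int]:
    """Run the tree as a two-party protocol; return (output, Alice bits, Bob bits)."""
    n = len(y)
    alice_bits = bob_bits = 0
    node = tree
    while not isinstance(node, int):
        subset, if_zero, if_one = node
        # Alice: XOR the rows indexed by the queried subset, send n bits.
        combined = [0] * n
        for i in subset:
            combined = [a ^ b for a, b in zip(combined, x[i])]
        alice_bits += n
        # Bob: reply with one bit, which equals the parity of z over the subset.
        answer = dot_f2(combined, y)
        bob_bits += 1
        node = if_one if answer else if_zero
    return node, alice_bits, bob_bits


def f(z: List[int]) -> int:
    """Example function (an assumption): (z0 XOR z1) AND z2."""
    return (z[0] ^ z[1]) & z[2]


# Depth-2 parity decision tree for f: query z0 XOR z1 first, then z2 if needed.
TREE = ((0, 1), 0, ((2,), 0, 1))

if __name__ == "__main__":
    x = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
    for y in product([0, 1], repeat=3):
        print(y, simulate(TREE, x, list(y)), f(matvec_f2(x, list(y))))
```

The sketch covers only this straightforward direction; the converse, that any such protocol can be turned into a parity decision tree of comparable cost, is the substantive part of the simulation theorem and is not reproduced here.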

2018

Transcription factor activities enhance markers of drug sensitivity in cancer

Authors
Garcia Alonso, L; Iorio, F; Matchan, A; Fonseca, N; Jaaks, P; Peat, G; Pignatelli, M; Falcone, F; Benes, CH; Dunham, I; Bignell, G; McDade, SS; Garnett, MJ; Saez Rodriguez, J;

Publication
Cancer Research

Abstract
Transcriptional dysregulation induced by aberrant transcription factors (TF) is a key feature of cancer, but its global influence on drug sensitivity has not been examined. Here, we infer the transcriptional activity of 127 TFs through analysis of RNA-seq gene expression data newly generated for 448 cancer cell lines, combined with publicly available datasets to survey a total of 1,056 cancer cell lines and 9,250 primary tumors. Predicted TF activities are supported by their agreement with independent shRNA essentiality profiles and homozygous gene deletions, and recapitulate mutant-specific mechanisms of transcriptional dysregulation in cancer. By analyzing cell line responses to 265 compounds, we uncovered numerous TFs whose activity interacts with anticancer drugs. Importantly, combining existing pharmacogenomic markers with TF activities often improves the stratification of cell lines in response to drug treatment. Our results, which can be queried freely at dorothea.opentargets.io, offer a broad foundation for discovering opportunities to refine personalized cancer therapies. Significance: Systematic analysis of transcriptional dysregulation in cancer cell lines and patient tumor specimens offers a publicly searchable foundation to discover new opportunities to refine personalized cancer therapies. © 2017 American Association for Cancer Research.

2018

Dynamics of a fixed bed adsorption column in the kinetic separation of hexane isomers in MOF ZIF-8

Authors
Mendes, PAP; Rodrigues, AE; Almeida, JP; Silva, JAC;

Publication
Springer Proceedings in Mathematics and Statistics

Abstract
A fixed bed adsorption mathematical model has been developed to describe the kinetic separation of hexane isomers when they flow through a packed bed containing the microporous Metal-Organic Framework (MOF) ZIF-8 adsorbent. The flow of inert and adsorbable species through the fixed bed is modeled with fundamental differential equations according to the mass and heat conservation laws, a general isotherm to describe adsorption equilibrium, and a lumped kinetic mass transfer mechanism between the bulk gas phase and the porous solid. It is shown that a proper combination of two characteristic times (the residence time of the gas in the fixed bed, $t_{fb}$, and the characteristic time of diffusion of solutes into the pores, $t_{dif}$) can lead to very different dynamics of fixed bed adsorbers, which in a limiting case can give rise to spontaneous breakthrough curves of solutes. The numerical simulations of an experimental breakthrough curve with the developed mathematical model clearly explain the complete separation between linear n-Hexane (nHEX) and the respective branched isomers: 3-Methyl-Pentane (3MP) and 2,2-Dimethyl-Butane (22DMB). The separation is due to significant differences in the diffusivity parameters $t_{dif}$ between 3MP and 22DMB and the residence time of the gas mixture $t_{fb}$ within the fixed bed. This work shows the importance of mathematical modelling for the comprehension and design of adsorption separation processes. © 2018, Springer International Publishing AG, part of Springer Nature.

2018

Towards a transactional and analytical data management system for Big Data

Authors
Luis Coelho, FAC;

Publication

Abstract
