About

Mohammad Nozari is a data scientist and postdoctoral researcher at INESC TEC, working on the NanoStima project, where he uses his expertise to extract knowledge from electrocardiography signals. He also collaborates with the Digi-NewB project, where he applies metalearning to fetal heart rate (FHR) data for knowledge discovery about the health of newborns. Previously, Mohammad worked as a researcher on several European projects, including MANTIS (root cause analysis for predictive maintenance) and DRIVE-IN (data collection from vehicular networks). He received his Ph.D. from the Faculty of Engineering of the University of Porto (FEUP) in 2016.

Publications

2017

Entropy and Compression Capture Different Complexity Features: The Case of Fetal Heart Rate

Authors
Santos, JM; Gonçalves, H; Bernardes, J; Coelho Antunes, LF; Zarmehri, MN; Santos, CC

Publication
Entropy


2016

Collaborative Data Analysis in Hyperconnected Transportation Systems

Authors
Zarmehri, MN; Soares, C

Publication
Collaboration in a Hyperconnected World

Abstract
Taxi trip duration affects operational efficiency, driver satisfaction and, above all, customer satisfaction, making it an important metric for taxi companies. In particular, knowing the predicted trip duration beforehand is very useful for allocating taxis to taxi stands and for finding the best route for each trip. A hyperconnected network makes it possible to collect data from connected taxis in the city and to use it collaboratively across taxis for better prediction. Indeed, given the high volume of data available for each individual taxi, several models can be generated. Moreover, because the data collected by different taxis differs, it can be organized into a hierarchy with several levels. However, finding the level of granularity that yields the best model for an individual taxi can be computationally expensive. In this paper, we propose the use of metalearning to select the level of the hierarchy and the algorithm that together generate the model with the best performance for each taxi. The proposed approach is evaluated on data collected in the Drive-In project. The results show that metalearning helps to select the algorithm with the best performance.
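For illustration, a minimal sketch in Python with scikit-learn of the exhaustive level search that metalearning is meant to replace. The data layout, column names ("taxi_id", "stand", "city", "hour", "weekday", "distance_km", "duration_s") and hierarchy levels are assumptions for the example, not the Drive-In schema.

# Illustrative sketch only: train one trip-duration model per hierarchy
# level for a single taxi and keep the level with the lowest validation
# error. Repeating this for every taxi is the expensive step the paper's
# metalearning approach avoids.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

FEATURES = ["hour", "weekday", "distance_km"]  # hypothetical trip features

def best_level_for_taxi(taxi_id, trips: pd.DataFrame,
                        levels=("taxi", "stand", "city")) -> str:
    """Return the hierarchy level whose pooled training data yields the
    lowest mean absolute error on this taxi's held-out trips."""
    own = trips[trips["taxi_id"] == taxi_id]
    train_own, test_own = train_test_split(own, test_size=0.3, random_state=0)
    errors = {}
    for level in levels:
        if level == "taxi":
            train = train_own                   # only this taxi's own data
        else:
            # pool trips from all taxis in the same group at this level,
            # excluding this taxi's held-out trips
            group = own[level].iloc[0]
            train = trips[trips[level] == group].drop(test_own.index)
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(train[FEATURES], train["duration_s"])
        pred = model.predict(test_own[FEATURES])
        errors[level] = mean_absolute_error(test_own["duration_s"], pred)
    return min(errors, key=errors.get)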

2015

Metalearning to Choose the Level of Analysis in Nested Data: A Case Study on Error Detection in Foreign Trade Statistics

Authors
Zarmehri, MN; Soares, C

Publication
2015 International Joint Conference on Neural Networks (IJCNN)

Abstract
Traditionally, a single model is developed for a data mining task. As more data is collected at a more detailed level, organizations are becoming more interested in having specific models for distinct parts of the data (e.g. customer segments). From a business perspective, data can be divided naturally into different dimensions, each usually hierarchically organized (e.g. country, city, zip code). This means that, when developing a model for a given part of the problem (e.g. a zip code), the training data may be collected at different levels of this nested hierarchy (e.g. the same zip code, or the city or country it is located in). The chosen level of granularity may change the performance of the whole process, so the question is which level to use for a given part. We propose a metalearning model which recommends the level of granularity for the training data that is expected to yield the best-performing model. We apply decision tree and random forest algorithms for metalearning. At the base level, our experiments use results obtained by outlier detection methods on the problem of detecting errors in foreign trade transactions. The results show that metalearning helps to find the best level of granularity.
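As a rough illustration of the metalearning step described above, the sketch below trains a random forest to map dataset-level descriptors to the granularity level that performed best, so a new dataset can be assigned a level without an exhaustive search. The metafeatures and synthetic data are invented for the example, not taken from the paper.

# Sketch of the metalearning step (assumed details): one metafeature row
# per base-level dataset, labelled with the level that won offline.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def metafeatures(df: pd.DataFrame) -> dict:
    """Simple dataset-level descriptors (illustrative choices)."""
    return {
        "n_rows": len(df),
        "mean_skew": df.skew(numeric_only=True).mean(),
        "mean_std": df.std(numeric_only=True).mean(),
    }

# Stand-in for the offline phase: synthetic base-level datasets, each
# labelled with the best granularity level (in the paper this label comes
# from actual base-level experiments with outlier detection methods).
base_datasets = [pd.DataFrame(rng.normal(size=(rng.integers(50, 500), 4)))
                 for _ in range(60)]
meta_y = rng.choice(["zip", "city", "country"], size=60)

meta_X = pd.DataFrame([metafeatures(df) for df in base_datasets])
meta_model = RandomForestClassifier(n_estimators=200, random_state=0)
meta_model.fit(meta_X, meta_y)

# Online phase: recommend a level for a new dataset without the search.
new_dataset = pd.DataFrame(rng.normal(size=(120, 4)))
print(meta_model.predict(pd.DataFrame([metafeatures(new_dataset)]))[0])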

2015

Using Metalearning for Prediction of Taxi Trip Duration Using Different Granularity Levels

Authors
Zarmehri, MN; Soares, C

Publication
Advances in Intelligent Data Analysis XIV

Abstract
Trip duration is an important metric for the management of taxi companies, as it affects operational efficiency, driver satisfaction and, above all, customer satisfaction. In particular, the ability to predict trip duration in advance can be very useful for allocating taxis to stands and finding the best route for trips. A data mining approach can be used to generate models for trip time prediction. In fact, given the amount of data available, different models can be generated for different taxis. Given the difference between the data collected by different taxis, the best model for each one can be obtained with different algorithms and/or parameter settings. However, finding the configuration that generates the best model for each taxi is computationally very expensive. In this paper, we propose the use of metalearning to address the problem of selecting the algorithm that generates the model with the most accurate predictions for each taxi. The approach is tested on data collected in the Drive-In project. Our results show that metalearning can help to select the algorithm with the best accuracy.
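A sketch of the per-taxi base-level step implied by the abstract: cross-validate several candidate algorithms on one taxi's trips and take the winner as that taxi's meta-target, which a metalearner is later trained to predict. The candidate set and the synthetic data are assumptions for illustration, not the paper's setup.

# Offline labelling sketch: find the best algorithm for a single taxi by
# 5-fold cross-validated mean absolute error.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

CANDIDATES = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

def best_algorithm(X: np.ndarray, y: np.ndarray) -> str:
    """Return the candidate with the lowest cross-validated MAE on one
    taxi's trips (X: trip features, y: durations in seconds)."""
    scores = {
        name: -cross_val_score(model, X, y, cv=5,
                               scoring="neg_mean_absolute_error").mean()
        for name, model in CANDIDATES.items()
    }
    return min(scores, key=scores.get)

# Example with synthetic trips for a single taxi:
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))               # e.g. hour, weekday, distance
y = 600 + 120 * X[:, 2] + rng.normal(scale=30, size=300)
print(best_algorithm(X, y))                 # meta-target for this taxi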

2013

POPSTAR at RepLab 2013: Name ambiguity resolution on Twitter

Authors
Saleiro, P; Rei, L; Pasquali, A; Soares, C; Teixeira, J; Pinto, F; Nozari, M; Felix, C; Strecht, P

Publication
CEUR Workshop Proceedings

Abstract
Filtering tweets relevant to a given entity is an important task for online reputation management systems, as it contributes to a reliable analysis of opinions and trends regarding that entity. In this paper we describe our participation in the Filtering Task of RepLab 2013. The goal of the competition is to classify a tweet as relevant or not relevant to a given entity. To address this task we studied a large set of features that can be generated to describe the relationship between an entity and a tweet. We explored different learning algorithms as well as different types of features: text, keyword similarity scores between entities' metadata and tweets, the Freebase entity graph and Wikipedia. The test set of the competition comprises more than 90,000 tweets about 61 entities from four distinct categories: automotive, banking, universities and music. Results show that our approach achieves a Reliability of 0.72 and a Sensitivity of 0.45 on the test set, corresponding to an F-measure of 0.48 and an Accuracy of 0.908.
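As a toy illustration of the filtering task, a TF-IDF text baseline in Python with scikit-learn is sketched below. The system in the paper combined several feature families (keyword similarity, Freebase, Wikipedia) and multiple algorithms; the tweets and labels here are invented for the example.

# Minimal relevant/irrelevant tweet classifier: TF-IDF text features
# feeding a linear model, one of the feature families mentioned above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "Test-driving the new electric sedan today",   # relevant to a car brand
    "My bank raised its savings rate again",       # not relevant to it
    "Great campus tour at the university open day",
    "The sedan's battery range is impressive",
]
labels = ["relevant", "irrelevant", "irrelevant", "relevant"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(tweets, labels)
print(clf.predict(["Is the new sedan worth the price?"]))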