Title: Distributed learning with information streams
This is a 48-month doctoral scholarship at Universidade NOVA de Lisboa.
Scope: Current Web information systems discard most streams of Web user data. Data from user location, devices, services and other sensors hides specific information consumption patterns that online services could identify to better answer consumer needs. Most of this data is useful only for a short period of time and relates to short-lived events, which end well before a batch, non-distributed data mining algorithm can finish processing large-scale data.
Learning in large-scale non-stationary environments is a major challenge. Learning algorithms need to cope both with the constant arrival of large volumes of training data and with data that is distributed across multiple locations. Loss functions must therefore trade off the training error rate against the delay incurred by iterating over distributed data. The applicant will investigate a novel breed of learning methods for large-scale distributed environments.
This scholarship is offered in the context of the GoLocal project.
Recruitment: The applicant should be fluent in English and have a good background in data science and in distributed data architectures for big data processing (Apache Spark and Hadoop). The applicant should also have programming experience in Python, Java and C++.