Past

1. SmartDataLake: Sustainable DataLakes for Extreme-Scale Analytics

https://smartdatalake.eu/ SmartDataLake aims at designing, developing and evaluating novel approaches, techniques and tools for extreme-scale analytics over Big Data Lakes. It tackles the challenges of reducing costs and extracting value from Big Data Lakes by providing solutions for virtualized and adaptive data access; automated and adaptive data storage tiering; smart data discovery, exploration and mining; monitoring and assessing the impact of changes; and empowering the data scientist in the loop through scalable and interactive data visualizations.

2. Near Data Processing - Collaboration with Huawei

Traditional compute-centric data processing systems suffer from excessive data movement costs. The recent trend of decoupling compute and storage in favor of scalability and resilience further exacerbates data transfer overheads. This project enables a data-centric architecture that adaptively offloads computation to storage nodes and elastically adjusts computational resources in order to eliminate funnel bottlenecks.

3. Batch Query Optimization - Collaboration with Huawei

This project proposes a system that achieves high throughput and resource efficiency in multi-tenant infrastructures by sharing data and operators among concurrent queries. The work spans three axes: (a) design and implementation of sharing-aware analytical operators for a well-known open-source analytical engine, (b) algorithms for multi-query optimization, and (c) locality-aware data placement and partitioning.

4. E2Data: European Energy-efficient Big Data Stacks

https://e2data.eu/ E2Data proposes an end-to-end solution for Big Data deployments that will fully exploit and advance the state-of-the-art in infrastructure services by delivering more performance from fewer resources. The E2Data stack will achieve this by dynamically profiling, compiling and optimising code for execution on chosen devices, such as CPUs, GPUs, FPGAs, and others. By removing the need for developers to craft device-specific code in languages like CUDA or OpenCL, E2Data will create substantial savings in developer time, while still exploiting the power of diverse device architectures, such as those currently offered by Microsoft, Amazon and others.

5. Data Sovereignty through the use of Blockchain - (ΕΔΒΜ34: Support for young researchers)

The project proposes the combination of Cloud Computing and Blockchain Technology into a platform that enables the on-demand creation of a fully distributed, scalable and secure virtual infrastructure for data storage and processing. One of the main targets of the project is to address the limitations of current blockchain implementations in terms of required storage, latency and throughput by adopting different architectural choices that will render blockchain usable in the context of Cloud Computing.

6. A Scalable Analytics Platform (ASAP)

http://www.asap-fp7.eu/ The ASAP FP7 research project develops a dynamic open-source execution framework for scalable data analytics. The underlying idea is that no single execution model is suitable for all types of tasks, and no single data model (and store) is suitable for all types of data. Complex analytical tasks over multi-engine environments therefore require integrated profiling, modeling, planning and scheduling functions. The project has four goals:
  • A general-purpose task-parallel programming model and a runtime system to execute it in the cloud. The runtime will incorporate and advance the features of state-of-the-art task-parallel programming models: irregular general-purpose computations, resource elasticity, synchronization, data-transfer, locality and scheduling abstraction, the ability to handle large sets of irregular distributed data, and fault-tolerance.
  • A modeling framework that constantly evaluates the cost, quality and performance of data and computational resources in order to decide on the most advantageous store, indexing and execution pattern available.
  • A unique adaptation methodology that will enable the analytics expert to amend the task she has submitted at an initial or later stage.
  • A state-of-the-art visualization engine that will enable the analytics expert to obtain accurate, intuitive results of the analytics tasks she has initiated in real-time.

7. CLARIN-EL: Common Language Resources and Technology Infrastructure

http://www.clarin.gr/ CLARIN-EL is the Greek counterpart of the CLARIN project, a pan-European effort to collect and distribute to the research community language resources (text/speech/multimodal corpora, lexica, terminological glossaries etc.) in all languages, together with the relevant language processing tools (morphological/syntactic analysers, parsers, taggers, statistical tools etc.), through a web-based Research Infrastructure.

8. CELAR: Automatic, multi-grained elasticity-provisioning for the Cloud

http://www.celarcloud.eu/ Auto-scaling resources is one of the top obstacles and opportunities for Cloud Computing: consumers can minimize the execution time of their tasks without exceeding a given budget; cloud providers maximize their financial gain while keeping their customers satisfied and minimizing administrative costs. Many systems claim to offer adaptive elasticity, yet the “throttling” is usually performed manually, requiring the user to figure out the proper scaling conditions. In order to harvest the benefits of elastic provisioning, it is imperative that it be performed in an automated, fully customizable manner. CELAR delivers a fully automated and highly customizable system for elastic provisioning of resources in cloud computing platforms.

9. MoDisSENSE: A Distributed Platform for the Development of Social Networking Services over Mobile Devices

http://www.modissense.gr MoDisSENSE enriches social networking services by exploiting the continuous data flow from the daily use of mobile phones. This flow includes data from user-visited locations, contacts, calls and calendar, combined with data acquired from the user’s social network (list of friends, profile and preferences). The project combines these heterogeneous data sources (geographic and social log files, user profiles and preferences, and context information) and offers innovative services based on advanced searches and combined queries that exploit all of these sources by utilizing state-of-the-art distributed data processing techniques. Furthermore, the project deals with the development and deployment of services that exploit spatiotemporal data generated by user paths.


Contact

Ioannis Mytilinis is a Software Engineer @ MySQL HeatWave, Oracle.

Email:
Google Scholar | CV