Title | Achievements | Challenge | Description | How they used EOSC-hub services | How EOSC-hub helped | Logos | Subtitle | The value proposal of the pilot | Title | Work Plan |
---|---|---|---|---|---|---|---|---|---|---|
BI Insight |
| The growing volume of electronic documents of all kinds, especially in large organizations, government institutions and administration, drives the search for effective methods of working with such documents: finding them quickly, classifying them and making full use of the information they contain. To meet the challenge of improving how knowledge and information collected in an organization are shared, we designed and implemented a solution that facilitates access to the knowledge accumulated in unstructured documents stored in local resources and in private and public clouds. The solution is distinguished by its mechanisms for automatic indexing and sending of documents and for intelligent data searching. Thanks to these, our system can quickly, accurately and efficiently find the most relevant information, combining it with publicly available registers and Wiki resources. A user-friendly search engine using artificial intelligence mechanisms accelerates the process of obtaining information and optimizes work in the organization, increasing personal productivity and the efficiency of information circulation processes. The use of these features in scientific and academic environments can significantly contribute to accelerating the development of science, innovation, and discovery. | BI Insight S.A. is a Polish company operating in the market since 2006. It specializes in solutions combining Business Intelligence, Artificial Intelligence and Big Data technologies as well as best practices in data management. BI Insight has many years of experience in natural language processing (NLP), cooperates closely in this field with leading academic centers and industry experts, and is one of the leading providers of this type of solution on the Polish market. BI Insight has created a system enabling users to access the knowledge contained in artifacts: presentations, text documents, sheets and others.
The system uses Natural Language Processing and Machine Learning algorithms to create recommendations, classify documents, retrieve information (both from text and from images embedded in documents) and build intelligent summaries. The bi ECM system won first prize in the GOVTECH 2019 competition and was a finalist in the IT Future Awards 2019 competition. It has been successfully implemented at the Ministry of Development, where it is used by about 150 users. | -- To be completed when the project is finished | -- To be completed when the project is finished | Access the knowledge contained in artifacts: presentations, text documents, sheets and others. | -- To be completed when the project is finished | BI Insight: Business Intelligence, Artificial Intelligence and Big Data technologies. | As part of the pilot, BI Insight plans to offer the system's functionality as a service to a wide range of potential users from academia, industry and government. Deployment on the DEEP infrastructure is an opportunity to verify the platform's capability to support the system in machine learning tasks. DEEP-Hybrid-DataCloud will provide the execution environment for such tasks (services for training models and putting them into operation), with the EOSC DIH offering cloud resources, integration support and consultancy and serving as a wider dissemination channel. Finally, once validated, the ECM product for managing knowledge accumulated in text documents will be presented and offered to clients through the EOSC marketplace. | |
NetService | -- To be completed when the project is finished | Original records can be manipulated and falsified by officials or black-market forgers. In most cases it is also not difficult to create realistic-looking replicas of official documents that contain false information. The major challenge is that a paper-based document is used to transmit some kind of information and identity to the bearer. Because these documents are easy to forge or can be based on real but stolen documents, they convey significant privileges to the bearer with only a small risk of exposure. In a blockchain-based system, paper-based documents are replaced with digital documents on an immutable ledger. The immutable nature of the blockchain means that these digital documents are impossible to duplicate or forge because there is only a unique, single record. | The pilot aims to explore how public institutions can issue valid official documents in digital form on the blockchain. The proposed architecture is based on a permissioned blockchain (Ethereum Proof of Authority, or similar). Access to this blockchain can be obtained, at project level and possibly within a commercial version of the product, via an authentication service from a Certification Authority on the EUTSL list, or via an AAI service provided by the EOSC-hub project, such as Check-In or B2Access. The pilot will seek to demonstrate that the solution can be deployed on a federated infrastructure such as the EOSC, along with cloud service support. | -- To be completed when the project is finished | -- To be completed when the project is finished | **Needed** | -- To be completed when the project is finished | NetService: Blockchain for university certificates | In the context of the EOSC-hub project, EOSC DIH partners and Net Service will carry out the following tasks:
| |
DCP | -- To be completed when the project is finished | To address the challenge of providing researchers with sufficient and cost-effective computing resources, Kings Distributed Systems is deploying the Distributed Compute Protocol (DCP), a cross-platform solution that aggregates computing resources from arbitrary devices and digital infrastructure, from smartphones to enterprise servers, and makes them available to researchers and innovators on demand. DCP would allow both individual institutions and federated infrastructures, such as the EOSC, to recapture and allocate underutilized resources, while providing a credit-based accounting system to quantify usage of processing, bandwidth, and storage resources. The company envisions the Distributed Compute Protocol becoming the multi-platform standard for distributed and edge computing. Kings Distributed Systems is facilitating access to limitless computing resources to accelerate science, innovation and discovery. Overall, this pilot aims not only to test but also to showcase the applicability and value of such a solution for the European Open Science Cloud. | Kings Distributed Systems is a Canadian company deploying the Distributed Compute Protocol (DCP), a web platform that aggregates excess computing power from underutilized devices and digital infrastructure and makes it available to researchers and innovators. Its Compute API allows users to express parallel workloads easily, e.g. for Advanced Research Computing, AI/ML, blockchain and mathematical finance. The Protocol automatically distributes those workloads for computation. | -- To be completed when the project is finished | -- To be completed when the project is finished |
| Automating resource allocation and multi-metric accounting in a federated digital marketplace | -- To be completed when the project is finished | DCP: dynamic resource allocation and accounting in a digital marketplace |
|
BBC R&D | -- To be completed when the project is finished | Audiences are consuming more and more video and demanding ever higher quality on a variety of devices, including TVs, smartphones, tablets and computers. This is why video compression standards are needed: they allow compressed content to be distributed and then decoded by anyone, ready to be displayed on the device of choice. In this context, research is supported by the H2020 Marie Sklodowska-Curie ETN grant JOLT and the UK’s EPSRC iCASE grants, under which researchers are also enrolled in PhD programmes at Dublin City University and Queen Mary University of London. To enable their research, access to adequate computational facilities is needed. Large-scale processing resources have the potential to transform how content providers obtain, produce and deliver content in challenging scenarios. A move away from expensive, bespoke, broadcast-specific facilities and hardware to more commoditised, scalable cloud-based resources will enable providers to manage their content more efficiently than has traditionally been achievable. | The video coding team within BBC R&D focuses on multiple aspects of video technology, with the general goal of supporting the delivery of high-quality content to all BBC audiences. In addition to performing core fundamental research on video compression standards, the team is researching new, advanced ways of performing compression based on machine learning, artificial intelligence and content analytics, while also applying its findings to enable new content experiences. | -- To be completed when the project is finished | -- To be completed when the project is finished | Transforming video content through compression and large-scale processing | -- To be completed when the project is finished | BBC R&D: video coding and compression |
| |
KAMPAL | Kampal Data Solutions has developed a machine learning model able to cope with big data samples by using the cloud infrastructure provided by EOSC DIH. The case study was based on a medical data set provided by FEETEG containing information on patients with Gaucher disease. In addition, extra data were generated based on values that the current literature considers normal. This made it possible to obtain a big data sample that loosely resembles the natural proportion of patients with Gaucher disease. Handling the increased problem size required parallelizing the code, which benefited from cloud computing. | Because Gaucher disease is a rare disease with few national registries, a local computer had enough computational power to analyse the data collected for the study of correlations with other diseases. The challenge now is to generate a new model able to predict the probability that a person will develop Gaucher disease. In this case, the AI model must include not only data from current Gaucher disease patients but also data from healthy people. Opening the sample universe to healthy people increases the sample size by orders of magnitude (from hundreds to millions) and potentially increases the model’s complexity. This implies the need for advanced computational resources such as the cloud platform provided by EOSC. Although this proof of concept is focused on Gaucher disease, the developed solution could be adapted in the future to databases for other diseases. The resulting general-purpose solution will be exploited by Kampal Data Solutions in the mid-term. | The Spanish Foundation for the Study and Treatment of Gaucher Disease and other Lysosomal Diseases (FEETEG) promotes scientific research on Gaucher disease and its treatment methods.
The Foundation is interested in predicting the probability that patients with Gaucher disease will develop other diseases, such as neoplasms or Parkinson’s disease (correlations between diseases). For this purpose, FEETEG contacted Kampal Data Solutions to develop an advanced analytical model based on Artificial Intelligence using the information available in the Gaucher Spanish Disease Registry. | The pilot required extra computational resources to cope with the problem size (1 million samples). For that, Kampal Data Solutions benefited from the EOSC DIH cloud infrastructure, where 16 vCPUs with 32 GB of RAM were used. To speed up the process and benefit from all the cores, the code was parallelized. This way, different operations can be done simultaneously on each core, using only a fraction of the sequential computation time. The parallelization of the code was greatly simplified by using the R packages parallel and dplyr. | EOSC-hub has provided Kampal Data Solutions with powerful cloud infrastructure to support the scaled-up analytics required for validating the proof of concept. Using the computing power of the EOSC-hub services, Kampal Data Solutions could experiment with and test its new models for disease prediction. The technical support provided by the EOSC DIH team helped Kampal to access and manage the cloud and gave it a better understanding of the EOSC computing infrastructure, while the visibility service enhanced the exposure of the pilot across different European communities. | Assessing the probability of development of further diseases in Gaucher disease patients | Although the obtained results do not have medical value, this proof of concept shows that the chosen model is scalable and could be efficiently applied to other conditions or illnesses where more data is available. The challenge now will be to identify the business opportunities to exploit the model.
| Kampal: Artificial Intelligence for rare disease diagnosis | In the context of the EOSC-hub project, Kampal Data Solutions will develop the following tasks:
|
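
The Kampal entry above describes parallelizing the analysis so that different operations run simultaneously on each core. The pilot did this in R with the parallel and dplyr packages; its actual code is not shown on this page. The following is only a minimal Python sketch of the same split, process and combine pattern. The function names, the sum-of-squares workload and the placeholder data are all hypothetical and stand in for the pilot's real model step.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(records, n_chunks):
    """Split a list into at most n_chunks contiguous pieces of similar size."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]


def process_chunk(chunk):
    """Hypothetical per-chunk work: a sum of squares stands in for the model step."""
    return sum(x * x for x in chunk)


def process_in_parallel(records, workers=None):
    """Run process_chunk on each chunk in its own process and combine the results."""
    workers = workers or os.cpu_count() or 1
    chunks = split_into_chunks(records, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    # Placeholder data; the pilot reports about 1 million samples on 16 vCPUs.
    samples = list(range(1_000_000))
    print(process_in_parallel(samples, workers=16))
```

The pattern only pays off when each chunk carries enough work to outweigh the cost of starting processes and moving data between them, which is one reason to use one chunk per worker, as assumed here.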