Performance evaluation of GANs in a semi-supervised OCR use case

Even in the age of big data, labelled data is a scarce resource in many machine learning use cases. We evaluate generative adversarial networks (GANs) on the task of extracting information from vehicle registrations with varying amounts of labelled data and compare their performance with that of supervised learning techniques. Adding unlabelled data improves performance significantly.

Multiplicative LSTM for sequence-based Recommenders

Recommender systems support customers' decision-making with personalized suggestions. They are widely used in domains such as e-commerce, social media, and entertainment, and they influence the daily life of almost everyone. Quite often, time plays a dominant role in generating a relevant recommendation.

Bridging the Gap: from Data Science to Production

A recent but quite common observation in industry is that, despite the overall high adoption of data science, many companies struggle to get it into production. Huge teams of well-paid data scientists often present one fancy model after another to their managers, but their proofs of concept …

Managing isolated Environments with PySpark

The Spark data processing platform is becoming increasingly important for data scientists who use Python. PySpark, the official Python API for Spark, makes it easy to get started, but managing applications and their dependencies in isolated environments is no easy task.
