Efficient UD(A)Fs with PySpark

Spark is one of the most prevalent technologies in data science and big data. Luckily, even though it is developed in Scala and runs on the Java Virtual Machine (JVM), it comes with Python bindings, also known as PySpark, whose API was heavily influenced by …

Declarative Thinking and Programming

Declarative programming is a paradigm that focuses on describing what should be computed in a problem domain without describing how it should be done. The post starts by explaining the differences between a declarative and an imperative approach with the help of examples from everyday life.

“Which car fits my life?” - mobile.de’s approach to recommendations

At mobile.de, Germany’s biggest car marketplace, a dedicated team of data engineers and scientists, supported by the IT project house inovex, is responsible for creating intelligent data products. Driven by our company slogan “Find the car that fits your life”, we focus on personalised recommendations to address several …

Causal Inference and Propensity Score Methods

In machine learning, and particularly in supervised learning, correlation is crucial for predicting the target variable with the help of the feature variables. Rarely do we think about causation and the actual effect of a single feature variable or covariate on the target or response. Some even …
