Alexandra Kapp @lxndrkp X Profile

Alexandra Kapp

@lxndrkp

Followers

761

Following

3K

Media

122

Statuses

770

https://t.co/FrCnaYuF0u

Berlin, Germany

Joined February 2015


Alexandra Kapp

@lxndrkp

4 years

I finally officially registered as a Ph.D. student researching 'privacy-preserving analytics of human mobility data applications'. 🎉 Using this occasion, I started a blog where I want to share insights on my work along the way: https://t.co/DdihnKNpUO

3

1

63

Nachhaltigkeitsforschung gestalten

@soef_BMBF

1 year

Great website from the @BMBF_Bund-funded project #freemove! Just dive into the world of data! #freemove helps with the balancing act between using data and protecting privacy - for more #Nachhaltigkeit (sustainability)! @TSBBerlin @TUBerlin @FU_Berlin @DLR_Verkehr https://t.co/ZWCkJCdp6j

fona.de

The mobility transition is a decisive lever when it comes to the sustainable city of tomorrow. Mobility data is touted as a solution to many of the challenges of this very transition, whether...

0

4

5

Alexandra Kapp

@lxndrkp

2 years

It is worth noting that trip data encompasses various relevant characteristics beyond spatial distribution (e.g. temporal info) all of which are discarded by these models. > Our results imply that current models fall short in their promise of high utility and flexibility. 13/13

0

Alexandra Kapp

@lxndrkp

2 years

The remaining 3 models somewhat maintain spatial distribution, one even with differential privacy guarantees. However, all models struggle to produce meaningful sequences of geo-locations with reasonable trip lengths and to model traffic flow at intersections accurately. /12

1

0

Alexandra Kapp

@lxndrkp

2 years

Out of the five evaluated models, one fails to produce data within reasonable computation time and another generates too many jumps to meet the requirements for map matching. /11

1

0
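
The "jumps" mentioned in the /11 post can be illustrated with a small check for implausibly large gaps between consecutive points. This is only a sketch: the function name `find_jumps`, the projected metre coordinates, and the 500 m threshold are assumptions for illustration, not the criterion used in the paper.

```python
import math

def find_jumps(trip, max_gap_m):
    """Return the indices i where the step from trip[i] to trip[i + 1]
    is longer than max_gap_m. Points are (x, y) in projected metres."""
    return [
        i
        for i, (p, q) in enumerate(zip(trip, trip[1:]))
        if math.dist(p, q) > max_gap_m
    ]

# Toy trip with one 1 km gap between the second and third point.
trip = [(0.0, 0.0), (100.0, 0.0), (1100.0, 0.0), (1200.0, 0.0)]
jumps = find_jumps(trip, max_gap_m=500.0)  # hypothetical threshold
print(jumps)
```

A trip with many flagged indices would give a map-matching step little to work with, since the matcher has to guess the route across each gap.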

Alexandra Kapp

@lxndrkp

2 years

Then, we introduced routing-engine-generated trips (like GoogleMaps) as a baseline, as they provide a privacy-friendly way of generating fine-granular routes that connect a start and an end point. /10

1

0

Alexandra Kapp

@lxndrkp

2 years

Firstly, none of the 5 evaluated models provide synthetic data on a level that is fine-granular enough to match the road network. Thus, we included a step of map matching. /9

1

0

Alexandra Kapp

@lxndrkp

2 years

We evaluated the utility of five state-of-the-art models, AdaTrace, PrivTrace, DP-Loc, a BiLSTM-based model, and TrajGAIL, using the designated utility metrics on a dataset comprising approximately 30,000 bicycle trips in Berlin. /8

1

0

Alexandra Kapp

@lxndrkp

2 years

Thus, we selected 4 tasks that closely reflect real-life tasks that trip data is used for to obtain a more realistic utility evaluation: trip lengths, traffic volume, road preference, and traffic flow at intersections. /7

1

0
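
Two of the four tasks named in the /7 post, trip lengths and traffic volume, can be sketched in a few lines. The helper names, the toy coordinates, and the segment IDs (`e1` ... `e4`) are hypothetical, and counting each trip once per road segment is an assumption; the paper's exact metric definitions may differ.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def trip_length_m(trip):
    """Sum of the distances between consecutive points of one trip."""
    return sum(haversine_m(p, q) for p, q in zip(trip, trip[1:]))

def traffic_volume(matched_trips):
    """Number of trips using each road segment (each trip counted once per segment).
    Every trip is a list of segment IDs, e.g. the output of a map-matching step."""
    volume = Counter()
    for segments in matched_trips:
        volume.update(set(segments))
    return volume

# Toy data: one trip going 0.01 degrees north, and three map-matched trips.
trip = [(52.50, 13.40), (52.51, 13.40)]
length = trip_length_m(trip)

matched = [["e1", "e2", "e3"], ["e2", "e3"], ["e3", "e4"]]
volume = traffic_volume(matched)
```

Computing both quantities on raw and on synthetic trips and then comparing the results gives a task-level view of utility, in contrast to comparing a single gridded distribution.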

Alexandra Kapp

@lxndrkp

2 years

Also: high similarity based on one distribution does not indicate a general high utility. For example, high similarity of spatial distributions does not allow conclusions about temporal distributions. Single distributions also do not reflect actual real-life use cases. /6

1

0

Alexandra Kapp

@lxndrkp

2 years

Distributions are typically discretized, e.g., a spatial distribution based on a grid. The resolution of such grids thereby highly influences the conclusion about the maintained utility: a high similarity on a 100 m res. has different implications than on a 1km res. /5

1

0
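
The effect of grid resolution described in the /5 post can be made concrete with a toy example. The sketch below bins points into square cells and compares two point sets with the Jensen-Shannon divergence; the choice of that divergence, the function names, and the toy coordinates are assumptions for illustration, not the evaluation used in the paper.

```python
import math
from collections import Counter

def grid_histogram(points, cell_size_m):
    """Share of points per square grid cell; points are (x, y) in projected metres."""
    counts = Counter(
        (math.floor(x / cell_size_m), math.floor(y / cell_size_m))
        for x, y in points
    )
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2): 0 = identical, 1 = no overlap."""
    div = 0.0
    for cell in set(p) | set(q):
        pi, qi = p.get(cell, 0.0), q.get(cell, 0.0)
        mi = 0.5 * (pi + qi)
        if pi > 0:
            div += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * math.log2(qi / mi)
    return div

# Toy data: the "synthetic" points are the raw points shifted 100 m east.
raw = [(50.0 + 200.0 * i, 50.0) for i in range(5)]
synthetic = [(x + 100.0, y) for x, y in raw]

jsd_100m = js_divergence(grid_histogram(raw, 100), grid_histogram(synthetic, 100))
jsd_1km = js_divergence(grid_histogram(raw, 1000), grid_histogram(synthetic, 1000))
print(jsd_100m, jsd_1km)
```

On the 100 m grid the two point sets share no cell, so the divergence is at its maximum; on the 1 km grid both fall into a single cell and look identical. The same synthetic data therefore appears useless or perfect depending only on the chosen resolution.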

Alexandra Kapp

@lxndrkp

2 years

How is utility measured? Typically, such synthetic data models are evaluated by comparing distributions, e.g., the spatial distribution, between raw and synthetic data. The higher the similarity the higher the utility. However, this approach has shortcomings: /4

1

0

Alexandra Kapp

@lxndrkp

2 years

Synthetic data, in this context, is created through models that learn respective distributions from raw data and maintain these. The goal is to create high-utility privacy-friendly synthetic datasets. /3

1

0

Alexandra Kapp

@lxndrkp

2 years

Why synthetic data? Human movement data is highly sensitive; however, data sharing is desirable for many use cases, including city planning or demand /2

1

0

Alexandra Kapp

@lxndrkp

2 years

📢New paper📢 We investigated the utility of five models that create synthetic urban mobility data from raw privacy-sensitive data. >synthetic trips do not provide the expected high flexibility and utility and should be used carefully. @h_mihaljevic https://t.co/c7yEYTkX4u 🧵/1

dl.acm.org

1

Alexandra Kapp

@lxndrkp

2 years

What exactly is explainable AI, and how does it work? I summarized that in an article for https://t.co/Nb6BkAiIgX 🤓

te.ma

te.ma presents key contributions from academic discourse and makes pressing questions open to well-founded discussion.

0

3

Robin Lovelace

@robinlovelace

2 years

New #geocompx blog post on Geographic Data Analysis in #RStats and #Python. For the first time, equivalent code for reading, plotting, and analysing geographic vector data in these two popular #DataScience languages is provided side-by-side 🚀 #OpenSource: https://t.co/6xnavxT1YW

3

58

194

Helena Mihaljevic

@h_mihaljevic

2 years

Starting in October, we are looking for a research associate in the area of #DeepLearning, #ComputerVision, image classification, and geodata. Goal: linking open data sources to predict the type and quality of road surfaces as precisely as possible.

1

7

4

Alexandra Kapp

@lxndrkp

2 years

For better usability for practitioners, models should provide clearer information on applicable use cases, input and output format, maintained (and discarded) empirical distributions, and required dataset size. Source code and example datasets should be made openly available.

0

Alexandra Kapp

@lxndrkp

2 years

Summary: such models for mobility data are not yet ready for use in practice, and more research with real-world test cases would be desirable.

1

0