Bonusdata_Helsinkimap.png
Bonusdata_ShopCategory-Time.png
K-data_OCR.sh
README.md
S-data_OCR.sh
bonusdata_process.R

README.md

Shopping data

First, watch the video animation Kanta-asiakkuuden jäljet!

How to get my shopping data?

S-group: Fill in a form at a customer service desk.

K-group: Fill in a paper form and mail it to X (sorry, I forgot the details and am still trying to find the link).

Both S and K will send you your data on paper by mail. This makes processing the data much harder, but not impossible. Hopefully the data will be provided in a convenient machine-readable format in the future.

Some news about the data here and here

How to analyze my shopping data?

In short, you need to

Scan your data
Use an optical character recognition (OCR) tool to convert the scanned data into a machine-readable format
Process and analyse the converted data
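As a rough illustration of the last step, here is a minimal shell sketch that totals the amounts in OCR output. It is not the R processing used in this repository. It assumes, hypothetically, that every line of the converted text ends with an amount written with a Finnish decimal comma, such as "12,50"; the function name `sum_amounts` is made up for this example.

```shell
# Sum the last field of each input line, reading a decimal comma as a dot.
# Assumption (hypothetical layout): every line ends with an amount like "12,50".
sum_amounts() {
    awk '{
             amount = $NF              # last whitespace-separated field
             sub(/,/, ".", amount)     # "12,50" -> "12.50"
             total += amount
         }
         END { printf "%.2f\n", total }'
}
```

It reads from standard input, so it can be used as `sum_amounts < converted.txt`, where `converted.txt` is a placeholder file name.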

Here's an example workflow that worked for me

Scan your data into a PDF
Use Tesseract for OCR
- Tesseract shell scripts: S-data, K-data
Use R for processing and analysing the data
- R script with a lot of different processing stages
More details at the end of this page!
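The scan-to-text part of this workflow could look roughly like the sketch below. It is not a copy of the S-data or K-data scripts. It assumes that PDFtk splits the PDF into pages and that ImageMagick's `convert` (with Ghostscript) turns each page into a TIFF; ImageMagick is an assumption here, not something this README lists. The function name `ocr_pdf` and the `page_NN` file names are hypothetical.

```shell
# Sketch: split a scanned PDF into pages, rasterise them, and OCR each page.
ocr_pdf() {
    pdf=$1
    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi
    # One single-page PDF per scanned page: page_01.pdf, page_02.pdf, ...
    pdftk "$pdf" burst output page_%02d.pdf || return 1
    for page in page_*.pdf; do
        base=${page%.pdf}
        # Rasterise at 300 dpi (assumed tool: ImageMagick, which needs
        # Ghostscript to read PDF input).
        convert -density 300 "$page" -depth 8 "$base.tif" || return 1
        # Writes $base.txt; "fin" selects the Finnish language data.
        tesseract "$base.tif" "$base" -l fin || return 1
    done
}
```

Calling `ocr_pdf scan.pdf` in an empty working directory would leave one `.txt` file per page next to the intermediate files.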

See the video animation Kanta-asiakkuuden jäljet!

Here are also some visualizations of the data:

More details of the tools used

Some tips and details on installing and using the tools on OS X 10.8.5.

OCR with Tesseract

Installation

Useful instructions here
Additionally, the Finnish language pack for Tesseract is needed
PDFtk is also needed
This blog post was helpful for getting started with Tesseract
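A quick way to check the installation before running anything is sketched below. The function names `require_tools` and `has_finnish` are made up for this example. The sketch assumes that the PDFtk command is called `pdftk` and that the installed Tesseract supports `--list-langs`, which may not hold for very old versions.

```shell
# Return success only if every named command is found on PATH.
require_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 1
        fi
    done
}

# Return success if Tesseract lists Finnish ("fin") among its languages.
# Older versions print the list on stderr, hence the 2>&1.
has_finnish() {
    tesseract --list-langs 2>&1 | grep -qx 'fin'
}
```

For example: `require_tools tesseract pdftk && has_finnish || echo "installation incomplete"`.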

Running OCR

If the data comes as a table with borders, OCR will struggle. Tesseract may have an option to handle this, but I did not find one. I ended up removing the horizontal lines in R, which was not trivial because the lines were slightly tilted rather than exactly horizontal.
It would also have been useful to add custom vocabulary such as "supermarket", but I did not get this to work with Tesseract (some hints here).
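A different and much cruder approach than removing the lines from the image is to clean up the text after OCR. The sketch below assumes that leftover table borders show up in the output as stray characters such as `|`, `_` and `-`; whether they do depends on the scan. The function name `clean_ocr_text` is hypothetical.

```shell
# Strip stray pipes, normalise whitespace, and drop lines that contain
# nothing but border debris (dashes, underscores, dots) or are empty.
clean_ocr_text() {
    sed -e 's/[|]/ /g' \
        -e 's/[[:space:]]\{1,\}/ /g' \
        -e 's/^ //' \
        -e 's/ $//' |
        grep -v -E '^[-_=. ]*$'
}
```

It works as a filter, for example `clean_ocr_text < page_01.txt > page_01.clean.txt`, with placeholder file names.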

Animation

Used R package animation
Needs ffmpeg
Needed to also install otool

Data sonification

Used R package playitbyr
Needs Csound, installation instructions here
Note! playitbyr does not work with Csound 6, so install version 5 instead!

