This repository has been archived by the owner on Jan 28, 2020. It is now read-only.

Script to generate parties and party sets for Costa Rica
Uses the parties.csv file in elections/cr/data/ to generate the parties
and party sets for the 2016 elections.

It generates a party set for each Canton and then uses the list of
Cantons the party is standing in to add it to the appropriate party
sets.

It also adds the "Cédula Jurídica" ID as an identifier but uses the
slugified name as the party slug as we don't have complete data for the
"Cédula Jurídica".
struan committed Jan 6, 2016
1 parent d1d9927 commit b9a472c
Showing 2 changed files with 182 additions and 0 deletions.
57 changes: 57 additions & 0 deletions elections/cr/data/parties.csv
@@ -0,0 +1,57 @@
Accesibilidad Sin Exclusión (PASE),3-110-420985,"Escazú, Desamparados, Goicoechea, Santa Ana, Curridabat, Perez Zeledón, Alajuela, Grecia, Palmares, Cartago, Paraíso, La Unión, Jiménez, Heredia, Liberia, Puntarenas, Esparza, Quepos, Pococí, Siquirres, Matina (21/81)"
Acción Ciudadana (PAC),3-110-301964 ,"San José, Escazú, Puriscal, Tarrazú, Aserrí, Mora, Goicoechea, Santa Ana, Alajuelita, Vásquez de Coronado, Acosta, Moravia, Dota, Curridabat, Perez Zeledón, Leon Cortés Castro, Alajuela, San Ramón, Grecia, San Mateo, Atenas, Naranjo, Palmares, Poás, Orotina, San Carlos, Valverde Vega, Upala, Los Chiles, Cartago, Paraíso, La Unión, Jiménez, Turrialba, Alvarado, Oreamuno, El Guarco, Heredia, Barva, Santo Domingo, Santa Barbara, San Rafael, San Isidro, Flores, San Pablo, Sarapiquí, Liberia, Nicoya, Santa Cruz, Bagaces, Carrillo, Cañas, Abangares, Tilarán, Nandayure, La Cruz, Hojancha, Puntarenas, Esparza, Buenos Aires, Montes de Oro, Osa, Quepos, Golfito, Coto Brus, Parrita, Corredores, Garabito, Pococí, Siquirres, Talamanca, Matina, Guácimo (73/81)"
Frente Amplio (FA),3-110-410964,"San José, Escazú, Desamparados, Puriscal, Tarrazú, Aserrí, Mora, Goicoechea, Alajuelita, Vásquez de Coronado, Moravia, Dota, Curridabat, Perez Zeledón, Alajuela, San Ramón, Grecia, San Mateo, Atenas, Naranjo, Palmares, Poás, San Carlos, Upala, Los Chiles, Cartago, Paraíso, La Unión, Jiménez, Turrialba, Oreamuno, El Guarco, Heredia, Barva, Santa Barbara, San Rafael, San Isidro, Flores, San Pablo, Sarapiquí, Liberia, Nicoya, Santa Cruz, Cañas, Abangares, Nandayure, Puntarenas, Esparza, Buenos Aires, Montes de Oro, Osa, Golfito, Corredores, Limón, Pococí, Siquirres, Talamanca, Matina, Guácimo (59/81)"
Integración Nacional (PIN),3-110-212993,"San José, Tibás, Naranjo, Santa Cruz, Abangares, Puntarenas, Osa, Quepos (8/81)"
Liberación Nacional (PLN),3-110-051854,"San José, Escazú, Desamparados, Puriscal, Tarrazú, Aserrí, Mora, Goicoechea, Santa Ana, Alajuelita, Vásquez de Coronado, Acosta, Tibás, Moravia, Montes de Oca, Turrubares, Dota, Curridabat, Perez Zeledón, Leon Cortés Castro, Alajuela, San Ramón, Grecia, San Mateo, Atenas, Naranjo, Palmares, Poás, Orotina, San Carlos, Zarcero, Valverde Vega, Upala, Los Chiles, Guatuso, Cartago, Paraíso, La Unión, Jiménez, Turrialba, Alvarado, Oreamuno, El Guarco, Heredia, Barva, Santo Domingo, Santa Barbara, San Rafael, San Isidro, Belén, Flores, San Pablo, Sarapiquí, Liberia, Nicoya, Santa Cruz, Bagaces, Carrillo, Cañas, Abangares, Tilarán, Nandayure, La Cruz, Hojancha, Puntarenas, Esparza, Buenos Aires, Montes de Oro, Osa, Quepos, Golfito, Coto Brus, Parrita, Corredores, Garabito, Limón, Pococí, Siquirres, Talamanca, Matina, Guácimo (81/81)"
Movimiento Libertario (ML),3-110-200226,"San José, Escazú, Desamparados, Puriscal, Tarrazú, Aserrí, Mora, Goicoechea, Santa Ana, Alajuelita, Vásquez de Coronado, Tibás, Moravia, Montes de Oca, Turrubares, Dota, Curridabat, Perez Zeledón, Leon Cortés Castro, Alajuela, San Ramón, Grecia, San Mateo, Atenas, Naranjo, Palmares, Zarcero, Upala, Los Chiles, Guatuso, Cartago, Paraíso, La Unión, Oreamuno, El Guarco, Heredia, Barva, Santo Domingo, Santa Barbara, San Rafael, San Isidro, Belén, San Pablo, Sarapiquí, Liberia, Santa Cruz, Bagaces, Carrillo, Abangares, Tilarán, Puntarenas, Esparza, Buenos Aires, Montes de Oro, Osa, Quepos, Golfito, Coto Brus, Parrita, Corredores, Garabito, Limón, Pococí, Siquirres, Talamanca, Matina, Guácimo (67/81)"
Nueva Generación (PNG),3-110-671418 ,"San José, Desamparados, Puriscal, Aserrí, Mora, Goicoeachea, Santa Ana, Alajuelita, Tibás, Moravia, Montes de Oca, Turrubares, Dota, Curridabat, Perez Zeledón, Alajuela, San Ramón, Palmares, Poás, Orotina, San Carlos, Valverde Vega, Guatuso, Cartago, Paraíso, La Unión, Turrialba, Alvarado, Heredia, Santo Domingo, Santa Barbara, San Isidro, San Pablo, Sarapiquí, Liberia, Nicoya, Santa Cruz, Bagaces, Tilarán, Nandayure, Puntarenas, Buenos Aires, Quepos, Pococí, Siquirres (45/81)"
Partido de los Trabajadores (PT), 3-110-071063,"San José, Curridabat, Alajuela, Naranjo, Los Chiles, Puntarenas, Limón, Pococí (8/81)"
Renovación Costarricense (PRC), 3-110-190890,"Escazú, Desamparados, Santa Ana, Alajuelita, Tibás, Moravia, Curridabat, Alajuela, San Ramón, Grecia, Naranjo, Palmares, Poás, Orotina, San Carlos, Los Chiles, Paraíso, Heredia, Barva, Liberia, Sarapiquí, Osa, Quepos, Golfito, Parrita, Corredores, Garabito, Limon, Pococí, Siquirres, Talamanca, Matina, Guácimo (33/81) "
Republicano Social Cristiano (PRSC),,"San José, Escazú, Desamparados, Tarrazú, Goicoeachea, Santa Ana, Alajuelita, Vásquez de Coronado, Tibás, Turrubares, Curridabat, Perez Zeledón, Alajuela, San Ramón, Grecia, San Mateo, Atenas, Naranjo, Palmares, Poás, Orotina, San Carlos, Zarcero, Upala, Los Chiles, Guatuso, Cartago, Paraíso, La Unión, Turrialba, Alvarado, Oreamuno, El Guarco, Heredia, Barva, Santo Domingo, Santa Barbara, San Rafael, San Isidro, Belén, Flores, San Pablo, Sarapiquí, Liberia, Nicoya, Santa Cruz, Bagaces, Carrillo, Abangares, Tilarán, Nandayure, Hojancha, Puntarenas, Esparza, Buenos Aires, Quepos, Coto Brus, Parrita, Corredores, Garabito, Limon, Pococí, Siquirres, Talamanca, Matina, Guácimo (66/81)"
Restauración Nacional (PREN),3-110-419368 ,"Desamparados, Goicoechea, Moravia, Montes de Oca, Curridabat, Pérez Zeledón, Alajuela, San Ramón, Poás, Orotina, Cartago, Alvarado, Heredia, Barva, Flores, San Pablo, Sarapiquí, Golfito (18/81)"
Unidad Social Cristiana (PUSC), 3-110-098296,"San José, Escazú, Desamparados, Puriscal, Tarrazú, Aserrí, Mora, Goicoechea, Santa Ana, Alajuelita, Vásquez de Coronado, Acosta, Tibás, Moravia, Montes de Oca, Turrubares, Dota, Curridabat, Perez Zeledón, Alajuela, San Ramón, Grecia, San Mateo, Atenas, Naranjo, Palmares, Poás, Orotina, San Carlos, Zarcero, Valverde Vega, Los Chiles, Guatuso, Cartago, Paraíso, La Unión, Jiménez, Turrialba, Oreamuno, El Guarco, Heredia, Barva, Santo Domingo, Santa Barbara, San Rafael, San Isidro, Belén, Flores, San Pablo, Sarapiquí, Liberia, Nicoya, Santa Cruz, Bagaces, Carrillo, Cañas, Abangares, Tilarán, La Cruz, Hojancha, Puntarenas, Esparza, Buenos Aires, Montes de Oro, Osa, Quepos, Golfito, Coto Brus, Parrita, Corredores, Garabito, Limon, Pococí, Siquirres, Talamanca, Matina, Guácimo (78/81)"
Alianza Demócrata Cristiana (Cartago),,"Cartago, Paraíso, La Unión, Jiménez, Turrialba, Alvarado, Oreamuno, El Guarco (8/81)"
Recuperando Valores (Limón),,"Limon, Siquirres, Guácimo (3/81)"
Verde (Cartago), 3-110-421552,"Cartago, Paraíso, La Unión (3/81)"
Viva Puntarenas (Puntarenas),3-110-674493,"Puntarenas, Buenos Aires, Osa, Quepos, Golfito, Coto Brus (6/81)"
Acción Cantonal Siquirres Independiente,3-110-359256,Siquirres (1/81)
Acuerdo de Alianza de Quepos ,,Quepos (1/81)
Alianza Cristiana Santaneña ,,Santa Ana (1/81)
Alianza por Palmares ,,Palmares (1/81)
Alianza por San José ,3-110-442583,San José (1/81)
Alianza Sancarleña ,,San Carlos (1/81)
Alianza Social por La Unión ,,La Unión (1/81)
Auténtico Labrador de Coronado ,,Vasquez de Coronado (1/81)
Auténtico Limonense ,,Limón (1/81)
Auténtico Siquirreño,,Siquirres (1/81)
Autónomo Oromontano,3-110-580805,Montes de Oro (1/81)
Avance Montes de Oca ,3-110-668203,Montes de Oca (1/81)
Barva Unida ,,Barva (1/81)
Cívico de Tibás Fuenteovejuna ,3-110-606664,Tibás (1/81)
Curridabat Siglo XXI ,3-110-426881,Curridabat (1/81)
Del Sol ,3-110-603639,Santa Ana (1/81)
Demócrata ,,Vasquez de Coronado (1/81)
Desamparados Unido ,,Desamparados (1/81)
Ecólogico Comunal Costarricense ,,Desamparados (1/81)
Fuerza Democrática Desamparadeña ,,Desamparados (1/81)
Fuerzas Unidas para el Cambio ,,San José (1/81)
Garabito Ecológico ,,Garabito (1/81)
Independiente Belemita,3-110-605963,Belén (1/81)
Independiente Escazuceño,,Escazú (1/81)
Justicia Generaleña ,,Perez Zeledón (1/81)
Liga Ramonense ,,San Ramón (1/81)
Limón Independiente ,,Limón (1/81)
Movimiento Avance Santo Domingo,3-110-604535,Santo Domingo (1/81)
Nueva Mayoría Griega ,,Grecia (1/81)
Parrita Independiente ,,Parrita (1/81)
Progreso Comunal Desampareño ,,Desamparados (1/81)
Pueblo Garabito ,,Garabito (1/81)
Puriscaleños de Corazón ,,Puriscal (1/81)
Renovación Cartago,,Cartago (1/81)
Renovemos Alajuela,3-110-576965,Alajuela (1/81)
Rescate Cantonal La Unión,,La Unión (1/81)
Restauración Parriteña ,,Parrita (1/81)
Solidaridad ,,San José (1/81)
Todo por Flores,3-110-609027,Flores (1/81)
Unión Guarqueño,,El Guarco (1/81)
Yunta Progresista Escazuceña ,3-110-207588,Escazú (1/81)
125 changes: 125 additions & 0 deletions elections/cr/management/commands/cr_import_parties_from_csv.py
@@ -0,0 +1,125 @@
# -*- coding: utf-8 -*-

import csv
import re

from django.core.management.base import BaseCommand
from django.core.exceptions import ObjectDoesNotExist

from django.utils.text import slugify

from candidates.models import PartySet, OrganizationExtra
from popolo import models as popolo_models
from popolo.importers.popit import PopItImporter


def fix_whitespace(s):
    s = s.strip()
    return re.sub(r'(?ms)\s+', '', s)

mhl Jan 6, 2016

Contributor

fix_whitespace is used on the names of parties, I think, but this strips out any run of spaces completely instead of replacing them with a single space - I guess that's not what you intend? Maybe this should be remove_whitespace and have a new fix_whitespace that reduces whitespace to a single space?

mhl Jan 6, 2016

Contributor

Oh, I see you've fixed this in a subsequent commit.
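
To make the distinction in this thread concrete, here is a minimal sketch of the two helpers the comment proposes. It is only an illustration: the subsequent commit mentioned above is not shown on this page, so the actual fix may differ, and the name `remove_whitespace` is taken from the suggestion rather than from the codebase.

```python
import re


def remove_whitespace(s):
    """Delete every run of whitespace (the behaviour of the helper in this commit)."""
    return re.sub(r'\s+', '', s.strip())


def fix_whitespace(s):
    """Trim both ends and collapse each internal run of whitespace to one space."""
    return re.sub(r'\s+', ' ', s.strip())
```

With these definitions, party names keep their word breaks under `fix_whitespace`, while `remove_whitespace` is only suitable for values such as the ID column, where no internal spaces are expected.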



class Command(BaseCommand):
    help = """Create or update parties from a CSV file

mhl Jan 6, 2016

Contributor

This needs to be a u""" string in order for the help to be displayed when running cr_import_parties_from_csv since it contains non-US-ASCII characters. Also the way long help strings are displayed is a bit unfortunate - they're word-wrapped so appear as:

 ./manage.py cr_import_parties_from_csv --help
usage: manage.py [-h] [--settings SETTINGS]

optional arguments:
  -h, --help           show this help message and exit
  --settings SETTINGS
usage: manage.py cr_import_parties_from_csv [-h] [--version] [-v {0,1,2,3}]
                                            [--settings SETTINGS]
                                            [--pythonpath PYTHONPATH]
                                            [--traceback] [--no-color]
                                            [--party-set PARTY_SET]
                                            CSV-FILENAME

Create or update parties from a CSV file Takes an argument of a CSV file which
should have 3 columns: 1: Name of party 2: ID of party (optional) 3: Comma
separated list of Cantons the party is standing in The ID is the "Cédula
Jurídica" and is added as an Identifier. It expects the CSV file to have NO
header row. It will create, or update, a party for each row in the CSV and a
party set for each Canton, adding the party to the party sets for the Cantons
in which the party is standing. This generates slugs by slugifying the party
name as we don't have the above IDs for all parties.

positional arguments:
  CSV-FILENAME

... so it might be worth reformatting it with that in mind. (I can't see an obvious way of changing the [formatter_class](https://docs.python.org/2/library/argparse.html#formatter-class) used by the management command.)
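
For reference, the word-wrapping described above is the default behaviour of argparse's `HelpFormatter`; the standard library's `RawDescriptionHelpFormatter` keeps the description's line breaks. The sketch below uses a stand-alone `argparse.ArgumentParser` purely to show the difference. It does not show how to hook a formatter into Django's management command machinery, and `build_parser` and the shortened `HELP` text are illustrative names, not part of this commit.

```python
import argparse

HELP = """Create or update parties from a CSV file

Takes an argument of a CSV file which should have 3 columns:
  1: Name of party
  2: ID of party (optional)
  3: Comma separated list of Cantons the party is standing in
"""


def build_parser(formatter_class):
    # Stand-alone parser for illustration only; Django constructs its own
    # parser for management commands, so this is not the command's parser.
    parser = argparse.ArgumentParser(
        prog='cr_import_parties_from_csv',
        description=HELP,
        formatter_class=formatter_class,
    )
    parser.add_argument('CSV-FILENAME')
    return parser


# Default formatter: whitespace in the description is collapsed and re-wrapped.
wrapped = build_parser(argparse.HelpFormatter).format_help()

# Raw formatter: the description is emitted with its original line breaks.
raw = build_parser(argparse.RawDescriptionHelpFormatter).format_help()
```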

Takes an argument of a CSV file which should have 3 columns:
1: Name of party
2: ID of party (optional)
3: Comma separated list of Cantons the party is standing in
The ID is the "Cédula Jurídica" and is added as an Identifier.
It expects the CSV file to have NO header row.
It will create, or update, a party for each row in the CSV and
a party set for each Canton, adding the party to the party sets
for the Cantons in which the party is standing.
This generates slugs by slugifying the party name as we don't
have the above IDs for all parties.
"""

    def add_arguments(self, parser):
        parser.add_argument('CSV-FILENAME')
        parser.add_argument('--party-set', default='default')

mhl Jan 6, 2016

Contributor

I think this --party-set argument should be removed.


    def add_id(self, party, id):

mhl Jan 6, 2016

Contributor

An annoying thing is that calling the parameter id shadows the built-in id function, which obviously doesn't matter in this case, but it's probably very slightly better style to call this party_id instead ;)

        related_data = {
            "identifier": id,
            "scheme": "Cédula Jurídica"

mhl Jan 6, 2016

Contributor

I think this should be a unicode, i.e. u"Cédula Jurídica" (it might work without, but I bet it'd create obscure bugs in the future :))

mhl Jan 6, 2016

Contributor

Actually, generally the schemes we've used for identifiers have looked less like human-readable strings and more like IDs - perhaps just make the scheme cedula-juridica instead?

        }
        try:
            party.identifiers.get(**related_data)
        except ObjectDoesNotExist:
            party.identifiers.create(**related_data)

mhl Jan 6, 2016

Contributor

The body of this method could be:

party.identifiers.get_or_create(identifier=id, scheme=u"Cédula Jurídica")

... which I think should have the same semantics. However, is that the behaviour that you want, or should it update any existing identifier of the scheme? e.g. with

party.identifiers.update_or_create(scheme=u"Cédula Jurídica", defaults={'identifier': id})
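
The difference between the two behaviours can be sketched with a plain-Python model. This is not the Django ORM: the functions below only imitate the lookup semantics on a list of dicts, the scheme name follows the `cedula-juridica` suggestion above, and the two identifier values are made up for the illustration.

```python
def get_or_create(identifiers, identifier, scheme):
    """Add the (scheme, identifier) pair unless that exact pair already exists."""
    for row in identifiers:
        if row['scheme'] == scheme and row['identifier'] == identifier:
            return row, False
    row = {'scheme': scheme, 'identifier': identifier}
    identifiers.append(row)
    return row, True


def update_or_create(identifiers, scheme, identifier):
    """Keep one row per scheme, overwriting its identifier if the row exists."""
    for row in identifiers:
        if row['scheme'] == scheme:
            row['identifier'] = identifier
            return row, False
    row = {'scheme': scheme, 'identifier': identifier}
    identifiers.append(row)
    return row, True


# Import once, then again after the ID in the CSV has been corrected.
# Both ID values are invented for this example.
with_get = []
get_or_create(with_get, '3-110-000001', 'cedula-juridica')
get_or_create(with_get, '3-110-000002', 'cedula-juridica')

with_update = []
update_or_create(with_update, 'cedula-juridica', '3-110-000001')
update_or_create(with_update, 'cedula-juridica', '3-110-000002')
```

In this model the first approach leaves both the stale and the corrected identifier attached to the party, while the second leaves only the corrected one.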

    def update_party(self, party_data):
        name = fix_whitespace(party_data[0])

mhl Jan 6, 2016

Contributor

It looks as if some of the party names have initialisms in brackets at the end (although sometimes things in brackets that aren't initialisms?) - what I did for Burkina Faso was to add the abbreviation as an other_name of the Organization, which is what people suggested when I asked about this.

struan Jan 7, 2016

Author Member

Some of them seem to be initialisms, but others seem to be the name of the Canton in which the party is standing. Given that there was no consistency, I decided it was better to remove it.
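
As an illustration of the approach discussed in this thread, separating a trailing bracketed part from a party name might look like the sketch below. `split_trailing_brackets` is a hypothetical helper, not code from this commit, and it cannot tell an initialism such as "FA" from a Canton name such as "Cartago", which is exactly the inconsistency described above.

```python
import re

# A name, optional whitespace, then one bracketed group at the very end.
TRAILING_BRACKETS = re.compile(r'^(.*?)\s*\(([^()]*)\)\s*$')


def split_trailing_brackets(name):
    """Return (name_without_brackets, bracket_text or None)."""
    match = TRAILING_BRACKETS.match(name)
    if match is None:
        return name.strip(), None
    return match.group(1).strip(), match.group(2).strip()
```

Whether the second element is stored as an `other_name` or simply discarded is then a separate decision for the caller.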

        party_id = fix_whitespace(party_data[1])

        # remove the (13/81) information text from the end of
        # the canton list.
        canton_list = re.search(
            r'^([^(]*)\(?',
            fix_whitespace(party_data[2])
        ).group(1)
        cantons = canton_list.split(',')
        slug = slugify(name)

        try:

mhl Jan 6, 2016

Contributor

I think that the check for whether the party already exists should be based on the party_id rather than the name.

            org = popolo_models.Organization.objects.get(
                name=name
            )
            print "found existing party {0}".format(name)
        except popolo_models.Organization.DoesNotExist:
            org = popolo_models.Organization.objects.create(name=name)

            OrganizationExtra.objects.create(
                base=org, slug=slug

mhl Jan 6, 2016

Contributor

We should probably change the name of the slug field on OrganizationExtra and PostExtra classes to make this less confusing, but they're really the canonical ID rather than a slug - links to the party pages are normally generated with an ignored slug at the end, so the code here will result in some URLs like /election/.../party/accesibilidad-sin-exclusion-pase/accesibilidad-sin-exclusion-pase. I think the OrganizationExtra.slug should be party_id (or slugified party name where there is no Cédula Jurídica) so, where there is a proper ID, it'll be /election/.../party/3-110-420985/accesibilidad-sin-exclusion-pase.

(I think it would still be a good idea to add an Identifier with any Cédula Jurídica, even after switching as I suggested, incidentally.)

struan Jan 7, 2016

Author Member

The main reason I used the slugified name as the ID was that it matched up with the EveryPolitician data. It also meant that the IDs were consistent.
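
For comparison, the alternative proposed earlier in this thread (use the Cédula Jurídica as the canonical ID where there is one, and fall back to the slugified name otherwise) could be sketched as follows. Both function names are hypothetical, and `simple_slugify` is only a rough stand-in for `django.utils.text.slugify` so that the example runs without Django.

```python
import re
import unicodedata


def simple_slugify(value):
    # Rough stand-in for django.utils.text.slugify (an assumption, not the
    # real implementation): strip accents, drop punctuation, hyphenate.
    value = unicodedata.normalize('NFKD', value)
    value = value.encode('ascii', 'ignore').decode('ascii')
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(r'[-\s]+', '-', value)


def canonical_party_id(name, party_id):
    """Prefer the official ID; fall back to the slugified party name."""
    party_id = party_id.strip()
    return party_id if party_id else simple_slugify(name)
```

The trade-off raised in the reply still applies: with this scheme some parties are identified by a Cédula Jurídica and others by a slug, whereas slugifying every name gives uniform IDs.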

            )
            print "created new party {0}".format(name)

        if party_id != '':
            self.add_id(org, party_id)

        for canton in cantons:
            party_set_slug = "2016_canton_{0}".format(slugify(canton))
            party_set_name = "2016 parties in {0} Canton".format(canton)
            party_set = self.get_party_set(party_set_slug, party_set_name)
            if not org.party_sets.filter(slug=party_set_slug):
                print "adding party set {0}".format(party_set_slug)
                org.party_sets.add(party_set)

    def get_party_set(self, requested_party_set_slug, party_set_name):

mhl Jan 6, 2016

Contributor

Rather than asking interactively about whether to create the party sets, maybe it would be better to create all 81 at the beginning? (The canton names can be sourced from http://international.mapit.mysociety.org/areas/CRCANTON now.)
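
A minimal sketch of that suggestion: compute every party set's slug and name from a list of Canton names before any rows are processed, so that nothing needs to be asked interactively. `party_sets_for` is a hypothetical helper; the slug and name formats are copied from `update_party`, and the `slugify` argument is assumed to be `django.utils.text.slugify` in the real command. Fetching the names from MapIt and creating the `PartySet` rows are left out.

```python
def party_sets_for(canton_names, slugify):
    """Return a (slug, name) pair for each Canton's 2016 party set."""
    return [
        (
            "2016_canton_{0}".format(slugify(canton)),
            "2016 parties in {0} Canton".format(canton),
        )
        for canton in canton_names
    ]


# Toy slugify for the demonstration only; it does not handle accents.
pairs = party_sets_for(
    ["Santa Ana", "Montes de Oro"],
    lambda s: s.lower().replace(' ', '-'),
)
```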

        try:
            return PartySet.objects.get(slug=requested_party_set_slug)
        except PartySet.DoesNotExist:
            self.stdout.write("Couldn't find the party set '{0}'".format(
                requested_party_set_slug
            ))
            all_party_sets = PartySet.objects.values_list('slug', flat=True)
            if PartySet.objects.exists():
                self.stdout.write("You might have meant one of these:")
                for other_party_set_slug in all_party_sets:
                    self.stdout.write(u" " + other_party_set_slug)
            self.stdout.write(
                "Create the party set '{0}'? (y/n) ".format(
                    requested_party_set_slug
                ),
                ending=''
            )
            response = raw_input()
            if response.strip().lower() != 'y':
                self.stderr.write("Exiting.")
                return
            return PartySet.objects.create(
                slug=requested_party_set_slug, name=party_set_name
            )

    def handle(self, **options):
        self.importer = PopItImporter()

        with open(options['CSV-FILENAME']) as f:
            csv_reader = csv.reader(f)
            for party_data in csv_reader:
                self.update_party(party_data)
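
The per-row parsing done by `handle` and `update_party` can be tried outside Django with a sketch like the one below. It assumes Python 3 (the command itself is Python 2), `collapse` and `parse_row` are hypothetical names, and it collapses whitespace rather than removing it, so multi-word names such as "La Unión" are preserved; that differs from `fix_whitespace` as written in this commit.

```python
import csv
import io
import re


def collapse(s):
    """Trim the ends and reduce internal whitespace runs to a single space."""
    return re.sub(r'\s+', ' ', s.strip())


def parse_row(row):
    """Split one CSV row into (name, party_id, list of Canton names)."""
    name = collapse(row[0])
    party_id = collapse(row[1])
    # Drop the trailing "(n/81)" count, then split the list on commas.
    canton_text = re.sub(r'\s*\(\d+/\d+\)\s*$', '', row[2])
    cantons = [collapse(c) for c in canton_text.split(',') if c.strip()]
    return name, party_id, cantons


# One row copied from parties.csv in this commit.
sample = io.StringIO(
    'Verde (Cartago), 3-110-421552,"Cartago, Paraíso, La Unión (3/81)"\n'
)
rows = [parse_row(row) for row in csv.reader(sample)]
```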
