diff --git a/python/introduction/part-2.files/README.md b/python/introduction/part-2.files/README.md
new file mode 100644
index 000000000..9621af917
--- /dev/null
+++ b/python/introduction/part-2.files/README.md
@@ -0,0 +1,287 @@
+# Introduction to Python: FILES
+
+In Python, dealing with files is very common and is a very important part of
+programming for a number of reasons:
+
+* Applications may need to read configuration files
+* In data science, data is often sourced from files (`CSV`, `XML`, `JSON`, etc.)
+* Data analysed in Python is often written to files at different stages of the analysis
+* DevOps engineers often store the state of infrastructure or data as files for automation purposes.
+
+Files are not the endgame for storage.
+Remember there are things like Caches and Databases.
+But before learning those things, file handling is the best place to start.
+
+## Python Dev Environment
+
+As in Part 1, we start with a [dockerfile](./dockerfile) where we declare our version of `python`.
+
+```
+cd python\introduction\part-2.files
+
+docker build --target dev . -t python
+docker run -it -v ${PWD}:/work python sh
+
+/work # python --version
+Python 3.9.6
+
+```
+
+## Our application
+
+Firstly we have a class to define what a customer looks like:
+```
+class Customer:
+    def __init__(self, c="",f="",l=""):
+        self.customerID = c
+        self.firstName = f
+        self.lastName = l
+    def fullName(self):
+        return self.firstName + " " + self.lastName
+```
+
+Then we need a function which returns our customers:
+```
+def getCustomers():
+    customers = {
+        "a": Customer("a","James", "Baker"),
+        "b": Customer("b", "Jonathan", "D"),
+        "c": Customer("c", "Aleem", "Janmohamed"),
+        "d": Customer("d", "Ivo", "Galic"),
+        "e": Customer("e", "Joel", "Griffiths"),
+        "f": Customer("f", "Michael", "Spinks"),
+        "g": Customer("g", "Victor", "Savkov"),
+        "h" : Customer("h", "Marcel", "Dempers")
+    }
+    return customers
+```
+
+Here is a function to return a specific customer:
+```
+def getCustomer(customerID):
+    customer = getCustomers()
+    return customer[customerID]
+```
+
+## Opening Files
+
+Python provides an `open` function to open files.
+`open()` takes a file path/name and an access mode:
+
+```
+"r" - Read - Default value. Opens a file for reading, error if the file does not exist
+"a" - Append - Opens a file for appending, creates the file if it does not exist
+"w" - Write - Opens a file for writing, creates the file if it does not exist
+"x" - Create - Creates the specified file, returns an error if the file exists
+```
+
+Try to open a file that holds our customer data:
+
+```
+open("customers.log")
+```
+
+We can see the file does not exist:
+
+```
+/work # python src/app.py
+Traceback (most recent call last):
+  File "/work/src/app.py", line 26, in <module>
+    open("customers.log")
+FileNotFoundError: [Errno 2] No such file or directory: 'customers.log'
+```
+
+Let's use what we learned (`if` statements) to check if the file exists!
+We'll need a built-in library for handling files:
+
+```
+import os.path
+```
+
+Then we can use the `os.path.isfile("customers.log")` function to check if the file exists:
+
+```
+os.path.isfile("customers.log")
+```
+
+Using `if` logic we can check if the file is there:
+
+```
+if os.path.isfile("customers.log"):
+    print("file exists")
+else:
+    print("file does not exist")
+```
+
+Now we know the file does not exist, but if it did, we could read it with `open`:
+
+
+```
+f = open("customers.log")
+```
+
+Let's also loop over each customer in the file and print it:
+
+```
+for customer in f:
+    print(customer)
+f.close()
+```
+
+Now we know the file does not exist, let's create it!
+
+```
+# "w" creates the file if it does not exist
+with open("customers.log", "w") as f:
+    for c in getCustomers().values():
+        f.write(c.customerID + "," + c.firstName + "," + c.lastName + "\n")
+```
+
+Now if we run our code the first time, it will create and populate the file as it does not exist,
+and will read the file and display the content on the second run.
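
The create-on-first-run, read-on-second-run flow described above can be sketched in one place. This is a minimal sketch rather than the tutorial's own code: the helper names `write_customers`, `read_customers` and `load_or_create` are hypothetical, and each customer is simplified to a `(firstName, lastName)` tuple so the block stays self-contained.

```python
import os.path


def write_customers(path, customers):
    """Write one comma-separated line per customer (hypothetical helper)."""
    # "w" creates the file if it does not exist and truncates it if it does.
    with open(path, "w") as f:
        for customer_id, (first_name, last_name) in customers.items():
            f.write(customer_id + "," + first_name + "," + last_name + "\n")


def read_customers(path):
    """Return the file's lines without their trailing newline (hypothetical helper)."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]


def load_or_create(path, customers):
    """Read the file if it exists; otherwise create it and return an empty list."""
    if os.path.isfile(path):
        return read_customers(path)
    write_customers(path, customers)
    return []
```

The `with` statement closes the file even if a write fails, which is why it is used here instead of a manual `f.close()`.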
+
+Instead of looping over each line in the file, we can read the entire file with the file's `read()` function:
+
+```
+print(f.read())
+```
+
+## Comma-Separated Values : CSV
+
+As we can see, our `customers.log` file is in CSV format with every field separated by commas.
+
+So far, we've demonstrated using primitives to read and write files to store our data.
+Looping over data structures like dictionaries and writing each line to a file one by one
+can use a lot of CPU if the data is large.
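
Performance aside, one further reason to prefer a CSV library over joining strings by hand is quoting. This point is not made in the text above. The sketch below uses a hypothetical last name that contains a comma and an in-memory `io.StringIO` buffer instead of a real file; both are assumptions made for illustration.

```python
import csv
import io

# Hypothetical row: the last field itself contains a comma.
row = ["b", "Jonathan", "D, Jr."]

# Joining by hand produces a line that splits back into four fields.
manual_line = ",".join(row)
manual_fields = manual_line.split(",")

# csv.writer quotes the field, so csv.reader recovers the original three.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
parsed = next(csv.reader(io.StringIO(buffer.getvalue())))

print(len(manual_fields), len(parsed))
```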
+
+### CSV: Reading our file
+
+To work with CSVs, we need to import the built-in `csv` library.
+We also need to add headers to our file so it makes setting fields easier:
+
+```
+customerID,firstName,lastName
+```
+
+```
+import csv
+with open('customers.log', newline='') as customerFile:
+    reader = csv.DictReader(customerFile)
+    for row in reader:
+        #print(row)
+        print("customer id:" + row['customerID'] + " fullName : " + row['firstName'] + " " + row['lastName'])
+
+```
+### CSV: Writing our file
+
+Create a list with our field headers:
+
+```
+fields = ['customerID', 'firstName', 'lastName']
+with open('customers.log', 'w', newline='') as customerFile:
+    writer = csv.writer(customerFile)
+    writer.writerow(fields)
+    customers = getCustomers()
+    for customerID in customers:
+        customer = customers[customerID]
+        writer.writerow([customer.customerID, customer.firstName, customer.lastName])
+```
+
+## Putting it all together
+
+Now that we have code that reads and writes to a file, let's update our `getCustomers` function to return
+customers from our file.
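
Before doing that, the reading and writing halves above can be exercised together as a round trip. Note that `csv.DictReader` yields plain dictionaries, not `Customer` objects, so this sketch rebuilds the objects explicitly. The names `FIELDS`, `dump_customers` and `load_customers` are hypothetical, and an in-memory buffer stands in for `customers.log`; treat it as one possible shape, not the tutorial's implementation.

```python
import csv
import io


class Customer:
    # Same shape as the tutorial's Customer class.
    def __init__(self, c="", f="", l=""):
        self.customerID = c
        self.firstName = f
        self.lastName = l

    def fullName(self):
        return self.firstName + " " + self.lastName


FIELDS = ["customerID", "firstName", "lastName"]


def dump_customers(customers, stream):
    """Write a header row plus one row per customer to any text stream."""
    writer = csv.writer(stream)
    writer.writerow(FIELDS)
    for customer in customers.values():
        writer.writerow([customer.customerID, customer.firstName, customer.lastName])


def load_customers(stream):
    """Read rows back and rebuild Customer objects keyed by customerID."""
    reader = csv.DictReader(stream)
    return {
        row["customerID"]: Customer(row["customerID"], row["firstName"], row["lastName"])
        for row in reader
    }


# Round trip through an in-memory buffer (an assumption, to avoid touching disk).
original = {"h": Customer("h", "Marcel", "Dempers")}
buffer = io.StringIO()
dump_customers(original, buffer)
buffer.seek(0)
restored = load_customers(buffer)
print(restored["h"].fullName())
```

Returning `Customer` objects from the loader means the result can be passed straight to a writer that reads `customer.customerID`, `customer.firstName` and `customer.lastName`.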
+
+We read the file if it exists, read it into a list and convert the list to a dictionary:
+
+```
+def getCustomers():
+    if os.path.isfile("customers.log"):
+        with open('customers.log', newline='') as customerFile:
+            reader = csv.DictReader(customerFile)
+            l = list(reader)
+            customers = {c["customerID"]: c for c in l}
+            return customers
+    else:
+        return {}
+```
+
+We can test our function to see it working:
+
+```
+customers = getCustomers()
+for customerID in customers:
+    print(customers[customerID])
+```
+
+Let's also create a function to update customers:
+
+```
+def updateCustomers(customers):
+    fields = ['customerID', 'firstName', 'lastName']
+    with open('customers.log', 'w', newline='') as customerFile:
+        writer = csv.writer(customerFile)
+        writer.writerow(fields)
+        for customerID in customers:
+            customer = customers[customerID]
+            writer.writerow([customer.customerID, customer.firstName, customer.lastName])
+```
+
+Let's test our two functions by deleting our file and recreating it using our functions:
+
+```
+customers = {
+    "a": Customer("a","James", "Baker"),
+    "b": Customer("b", "Jonathan", "D"),
+    "c": Customer("c", "Aleem", "Janmohamed"),
+    "d": Customer("d", "Ivo", "Galic"),
+    "e": Customer("e", "Joel", "Griffiths"),
+    "f": Customer("f", "Michael", "Spinks"),
+    "g": Customer("g", "Victor", "Savkov"),
+    "h" : Customer("h", "Marcel", "Dempers")
+}
+
+#save it
+updateCustomers(customers)
+
+#add another test customer
+test = Customer("t", "Test", "Customer")
+customers["t"] = test
+
+#save it
+updateCustomers(customers)
+
+#see the changes
+customers = getCustomers()
+for customer in customers:
+    print(customers[customer])
+```
+
+## Docker
+
+Let's build our container image and run it while mounting our customer file.
+
+Our final `dockerfile`:
+```
+FROM python:3.9.6-alpine3.13 as dev
+
+WORKDIR /work
+
+FROM dev as runtime
+COPY ./src/ /app
+
+ENTRYPOINT [ "python", "/app/app.py" ]
+
+```
+
+Build and run our container.
+Notice the `customers.log` file gets created if it does not exist.
+
+```
+cd python\introduction\part-2.files
+
+docker build . -t customer-app
+
+docker run -v ${PWD}:/work -w /work customer-app
+
+```
\ No newline at end of file
diff --git a/python/introduction/part-2.files/dockerfile b/python/introduction/part-2.files/dockerfile
new file mode 100644
index 000000000..7666e0356
--- /dev/null
+++ b/python/introduction/part-2.files/dockerfile
@@ -0,0 +1,8 @@
+FROM python:3.9.6-alpine3.13 as dev
+
+WORKDIR /work
+
+FROM dev as runtime
+COPY ./src/ /app
+
+ENTRYPOINT [ "python", "/app/app.py" ]
\ No newline at end of file
diff --git a/python/introduction/part-2.files/src/app.py b/python/introduction/part-2.files/src/app.py
new file mode 100644
index 000000000..42ab10bcf
--- /dev/null
+++ b/python/introduction/part-2.files/src/app.py
@@ -0,0 +1,75 @@
+import os.path
+import csv
+
+class Customer:
+    def __init__(self, c="",f="",l=""):
+        self.customerID = c
+        self.firstName = f
+        self.lastName = l
+    def fullName(self):
+        return self.firstName + " " + self.lastName
+
+def getCustomers():
+    if os.path.isfile("customers.log"):
+        with open('customers.log', newline='') as customerFile:
+            reader = csv.DictReader(customerFile)
+            l = list(reader)
+            customers = {c["customerID"]: c for c in l}
+            return customers
+    else:
+        return {}
+
+def updateCustomers(customers):
+    fields = ['customerID', 'firstName', 'lastName']
+    with open('customers.log', 'w', newline='') as customerFile:
+        writer = csv.writer(customerFile)
+        writer.writerow(fields)
+        for customerID in customers:
+            customer = customers[customerID]
+            writer.writerow([customer.customerID, customer.firstName, customer.lastName])
+
+def getCustomer(customerID):
+    customer = getCustomers()
+    return customer[customerID]
+
+# if os.path.isfile("customers.log"):
+#     with open('customers.log', newline='') as customerFile:
+#         reader = csv.DictReader(customerFile)
+#         for row in reader:
+#             print("customer id:" + row['customerID'] + " fullName : " + row['firstName'] + " " + row['lastName'])
+# else:
+#     fields = ['customerID', 'firstName', 'lastName']
+#     with open('customers.log', 'w', newline='') as customerFile:
+#         writer = csv.writer(customerFile)
+#         writer.writerow(fields)
+#         customers = getCustomers()
+#         for customerID in customers:
+#             customer = customers[customerID]
+#             writer.writerow([customer.customerID, customer.firstName, customer.lastName])
+
+customers = {
+    "a": Customer("a","James", "Baker"),
+    "b": Customer("b", "Jonathan", "D"),
+    "c": Customer("c", "Aleem", "Janmohamed"),
+    "d": Customer("d", "Ivo", "Galic"),
+    "e": Customer("e", "Joel", "Griffiths"),
+    "f": Customer("f", "Michael", "Spinks"),
+    "g": Customer("g", "Victor", "Savkov"),
+    "h" : Customer("h", "Marcel", "Dempers")
+}
+
+#save it
+updateCustomers(customers)
+
+#add another test customer
+test = Customer("t", "Test", "Customer")
+customers["t"] = test
+
+#save it
+updateCustomers(customers)
+
+#see the changes
+customers = getCustomers()
+for customer in customers:
+    print(customers[customer])
+