Plants vs. Animals prediction assignment.
You are a data scientist working for the United Kingdom Space Agency (UKSA) and you have been summoned to analyse some data from the secret space mission “Nereus”. Two UKSA space probes have recently arrived at the planet Nereus to collect data on the existing extraterrestrial life-forms living under water.
The first probe (probe A) collected data on 1000 life-forms, measuring 4 chemical compounds for each life-form {cryptonine, mermaidine, posidine, neraidine}, each at 3 different chemical resolutions (12 chemical measurements per life-form), plus a further genetic attribute called TNA. The second probe (probe B) unfortunately malfunctioned during data transmission, but before it did so we received a further dataset on 1000 life-forms, this time without the TNA measurements.
UKSA brought in biology researchers from the University of Warwick, who classified all the life-forms in the probe A dataset into plants (class 0) and animals (class 1). You are now being asked by UKSA to analyse the data and perform the following tasks:
Submit a Python script predictClass.py that reads the original CSV files probeA.csv, classA.csv and probeB.csv, and outputs another CSV file called classB.csv with your class predictions (probabilities for class 1) for the probeB data. [6 Marks]
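To illustrate the input/output shape this task asks for, here is a minimal sketch that uses only the Python standard library. It makes several assumptions that the brief does not state: every CSV file has a header row, classA.csv holds a single column of 0/1 labels in the same row order as probeA.csv, and classB.csv is written as one probability per line with no header. The function names and the hand-written logistic regression are illustrative choices, not a required method. Because probe B has no TNA measurements, the sketch trains only on the columns that both probe files share.

```python
import csv
import math


def read_csv(path):
    """Return (header, rows); assumes a header row and numeric cells."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = [[float(v) for v in row] for row in reader if row]
    return header, rows


def select(header, rows, columns):
    """Keep only the named columns, in the given order."""
    idx = [header.index(c) for c in columns]
    return [[row[i] for i in idx] for row in rows]


def standardise(train, other):
    """Scale both tables with the mean and std of the training table."""
    n, n_cols = len(train), len(train[0])
    means = [sum(r[j] for r in train) / n for j in range(n_cols)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in train) / n) or 1.0
        for j in range(n_cols)
    ]

    def scale(table):
        return [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in table]

    return scale(train), scale(other)


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Plain batch gradient descent on the logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


def predict_class(probe_a="probeA.csv", class_a="classA.csv",
                  probe_b="probeB.csv", out="classB.csv"):
    header_a, rows_a = read_csv(probe_a)
    _, labels = read_csv(class_a)          # assumed: one label column
    header_b, rows_b = read_csv(probe_b)
    # Probe B lacks TNA, so train only on columns present in both files.
    shared = [c for c in header_b if c in header_a]
    X_a, X_b = standardise(select(header_a, rows_a, shared),
                           select(header_b, rows_b, shared))
    w, b = fit_logistic(X_a, [row[0] for row in labels])
    probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X_b]
    with open(out, "w", newline="") as f:
        # Assumed output layout: one probability per line, no header.
        csv.writer(f).writerows([p] for p in probs)
    return probs
```

Calling predict_class() with no arguments in the directory that holds the three input files would write classB.csv there. A real submission would probably replace the hand-written model with a library classifier and check it by cross-validation on the probe A data; the file handling would stay the same.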
Submit a Python script predictTNA.py that reads the original CSV files probeA.csv, classA.csv and probeB.csv, and outputs another CSV file called tnaB.csv with your TNA predictions for the probeB data. [5 Marks]
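The TNA task is a regression problem: the target is the one probe A column that probe B is missing. The sketch below shows one possible baseline, a k-nearest-neighbours average, again using only the standard library. It assumes that the CSV files have header rows, that TNA is the only column present in probeA.csv but absent from probeB.csv, and that tnaB.csv is written as one value per line with no header. The function name predict_tna and the choice of k are hypothetical. For brevity the sketch does not read classA.csv, although the task statement lists it as an input.

```python
import csv
import math


def read_csv(path):
    """Return (header, rows); assumes a header row and numeric cells."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = [[float(v) for v in row] for row in reader if row]
    return header, rows


def predict_tna(probe_a="probeA.csv", probe_b="probeB.csv",
                out="tnaB.csv", k=5):
    header_a, rows_a = read_csv(probe_a)
    header_b, rows_b = read_csv(probe_b)

    # Features: columns in both files. Target: the single extra column in A.
    shared = [c for c in header_b if c in header_a]
    extra = [c for c in header_a if c not in header_b]
    if len(extra) != 1:
        raise ValueError(f"expected exactly one target column, found {extra}")
    target = header_a.index(extra[0])
    idx_a = [header_a.index(c) for c in shared]
    idx_b = [header_b.index(c) for c in shared]

    # Standardise with probe A statistics so no feature dominates distances.
    n = len(rows_a)
    means = [sum(r[i] for r in rows_a) / n for i in idx_a]
    stds = [
        math.sqrt(sum((r[i] - m) ** 2 for r in rows_a) / n) or 1.0
        for i, m in zip(idx_a, means)
    ]

    def scale(row, idx):
        return [(row[i] - m) / s for i, m, s in zip(idx, means, stds)]

    train = [(scale(r, idx_a), r[target]) for r in rows_a]

    preds = []
    for row in rows_b:
        q = scale(row, idx_b)
        # Squared Euclidean distance is enough for ranking neighbours.
        nearest = sorted(
            train,
            key=lambda t: sum((a - b) ** 2 for a, b in zip(q, t[0])),
        )[:k]
        preds.append(sum(tna for _, tna in nearest) / len(nearest))

    with open(out, "w", newline="") as f:
        # Assumed output layout: one prediction per line, no header.
        csv.writer(f).writerows([p] for p in preds)
    return preds
```

Holding out part of the probe A rows and comparing predicted with known TNA values is one way to judge whether such a baseline is adequate before predicting for probe B.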
Which attribute is most predictive of whether a life-form is a plant or an animal? Submit a PDF file named answer.pdf that includes your answer, your reasoning and any analysis that supports it. [4 Marks]
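One possible starting point for this analysis is to score each attribute on its own by how well it separates the two classes. The sketch below uses the area under the ROC curve (AUC), computed by direct pair counting, as that score. This is an illustrative choice rather than a prescribed method, and the function names are hypothetical. A univariate score like this ignores interactions between attributes, so it would be only one piece of evidence in the written answer.

```python
def auc(values, labels):
    """Probability that a random class-1 value exceeds a random class-0 value.

    Ties count as half a win. 0.5 means no separation; 0 or 1 means perfect.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_attributes(header, rows, labels):
    """Return (attribute, separability) pairs, most separable first.

    Separability is max(AUC, 1 - AUC), so an attribute that is lower for
    animals scores as highly as one that is higher for animals.
    """
    scores = []
    for j, name in enumerate(header):
        a = auc([row[j] for row in rows], labels)
        scores.append((name, max(a, 1.0 - a)))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Here header and rows would come from probeA.csv (including the TNA column) and labels from classA.csv, since probe A is the only dataset with known classes.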