Support for filae.com #28
Comments
Let me investigate how to query data from the site and how difficult it would be to create a parser for it. Do you have a link to a public profile or tree I could use for testing? One challenge will be that I don't speak French and I don't see an option to set the site language to English. I also have a request in for geneanet.org, so I'll be looking at that as well. |
I'm afraid you need a premium account to see profiles. They do give you a 1 month trial though (which I'm currently using). Geneanet would be interesting, too. |
I could try to code it already. Do you need a Geni Pro account to use this? |
You need a Geni Pro account to be able to copy over family members (API restriction on Geni). I often use http://www.nosorigines.qc.ca for free French / Canadian trees, so I've considered adding that. What I've tried to do thus far is focus on either the most popular or free sites. |
So getting a MyHeritage Data subscription only makes sense with Geni Pro, then? |
I think the MyHeritage Data subscription has a lot of value for research. SmartCopy just makes it easier to copy that data from MyHeritage to Geni. But SmartCopy can work on many websites, including trees and records at FamilySearch. |
Starting implementation in https://github.com/raphink/SmartCopy/tree/filae |
I have to say, geneanet.org is horribly constructed. It's so unstructured. No classes, no ids, ugh. It's one of the toughest pages I've seen for parsing. |
Yes indeed. It's a mess of a DOM... |
In some cases, it has been easier to grab the relationships from the initial page and then actually download and parse the parent pages, siblings, children. The data structure for the profile is usually more complete (such as containing the gender) and easier to parse, than trying to grab it out of the relationship info on the focus page. Here is one page I was using as a test. The death date is giving several profiles a problem. http://gw.geneanet.org/genevtabouis?lang=en&pz=sosa+fictif&nz=legoutiere&ocz=0&p=augustine+madeleine+augustine&n=riviere |
|
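The approach described above (collect the relationship links from the focus page, then download and parse each relative's own page) could be sketched roughly as follows in Python. This is only an illustration, not SmartCopy's actual code: `relative_links`, `crawl_relatives`, and the injected `fetch` and `parse_profile` callables are hypothetical names, and the rule that profile links carry both a `p` and an `n` query parameter is an assumption inferred from the test URL alone.

```python
from html.parser import HTMLParser
from typing import Any, Callable, Dict, List
from urllib.parse import parse_qs, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def relative_links(focus_html: str, base_url: str) -> List[str]:
    """Return absolute, de-duplicated profile URLs found on the focus page.

    Assumption: a profile link has both `p` and `n` query parameters,
    as in the test URL quoted in the thread.
    """
    collector = LinkCollector()
    collector.feed(focus_html)
    urls: List[str] = []
    for href in collector.links:
        url = urljoin(base_url, href)
        query = parse_qs(urlsplit(url).query)
        if "p" in query and "n" in query and url not in urls:
            urls.append(url)
    return urls


def crawl_relatives(
    focus_html: str,
    base_url: str,
    fetch: Callable[[str], str],
    parse_profile: Callable[[str], Any],
) -> Dict[str, Any]:
    """Download each relative's own page and parse it as a full profile.

    `fetch` maps a URL to its HTML; `parse_profile` is whatever profile
    parser already exists for the site. Both are supplied by the caller.
    """
    return {
        url: parse_profile(fetch(url))
        for url in relative_links(focus_html, base_url)
    }
```

Passing `fetch` and `parse_profile` in as callables keeps the sketch independent of any particular HTTP client or page layout, so it can be exercised offline with canned pages.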
Is filae.com still something you're working on? I hadn't seen any changes to that branch. |
No, I haven't really worked on it. The main reason is that I hardly use the genealogical trees I find there, as there are many more trees on Geneanet. So the main value of Filae is records, but they're not really parseable most of the time. I might want to work on this again later, but for now it's not worth the effort. |
I'm a French Python coder and I wrote a small parser to crawl Geneanet. I was also looking at parsing the data in the digitized microfilm on f i l a e . com to get all the indexed records. I also have access to it through my local family tree association. F i l a e stole every digitized image from every department of France and took advantage of a legal loophole to index them. The company obtained billions of images and sent all of them to other companies to process and index (like FamilySearch's indexed image search). Like F i l a e, I also parse Belgian images (I can download them in HD), and the same for the departmental archives of 62, 80, 59, and 02. All in Python. If you're interested, contact me at: yoan [dot] bouzin {at} gmail [dot] com |
@Tuisto59 we already have a parser for Geneanet. The problem with Filae is that its DOM is pretty bad to parse. |
In France, Filae.com has lots of information and trees. Would you be willing to accept a PR to support it?