Problems with text extracting #108

KaradasV3 · 2022-04-12T23:04:25Z

I have a problem extracting parts of the text from some PDF files. The output looks as if several threads were returning their results at the same time. There may also be problems with recognizing images in the file, although that does not always seem to be the case. Has anyone else had this problem? Are there any options I can add to fix it?

This is what the text looks like after extraction with Cermine:
finally we selected the best learning model to classify the samples into the three groups.
and finally we selected the best learning model to classify the samples into the three groups.
analysis and a feature selection procedure. Then, we developed a Cluster analysis to evaluate dataset homogeneity, and
22..33..DDaattaaPPrree--PPrroocceessssiinngg
TToo rereaaddanadnpdrepprroecpersoscreasws draatwa, ddifafetare,ndtpifafeckreangtes pinacthkea gfreasmeinwotrhkeR f(rhatmtpesw:/o/rwkwRw.
(rh-tptprosj:e//cwt.owrwg/.r,-apcrcoejesscet.dorogn/,2a4ccMesasyed20o2n1)2[415M] awyer2e02u1s)ed[1.5T] hweefroeuursdeadt.aTsehtes fwoeurreddaetraisveetds
wfreorme dtweroivdeidffefrreonmt ttewchondoilfofegrieens:t Atefcfyhmnoeltorgixieasn:dAAffygmileentrt.ixToanpdroAcegsisleAntf.fyTmoeptrrioxceCsEsLAfifflyes-,
mweetruisxeCdEthLef‘iolelisg, ow’ e[1u6s]epdacthkeag‘oe.liTghoe
After this point the text is extracted correctly.

The same part of the text extracted with another tool:

Figure 1. Flowchart of the proposed approach. After merging the four datasets, we implemented a differential expression
analysis and a feature selection procedure. Then, we developed a Cluster analysis to evaluate dataset homogeneity, and
finally we selected the best learning model to classify the samples into the three groups.
2.3. Data Pre-Processing
To read and preprocess raw data, different packages in the framework R
(https://www.r-project.org/, accessed on 24 May 2021) [15] were used. The four datasets were derived from two different technologies: Affymetrix and Agilent. To process Affy-metrix CEL files, we used the ‘oligo’ [16] package.

kwhkim · 2022-09-06T05:02:15Z

@KaradasV3 I am looking for an alternative for the same reason. Can you tell me which other tool you used?
