Apply functions to row header #21

ArcticTee · 2018-11-07T11:50:32Z

Hi,

First of all, this is a great library, thank you for keeping it up! To enhance the functionality, I would like to propose the following:

I just tried to import a sheet which has a header row with duplicates like this:

test | test | 1
1 | 2 | 2
2 | 3 | 2

This will result in a json object like this:

[
  {"test": 2, "1": 2},
  {"test": 3, "1": 2}
]

I would like to have the option to apply multiple functions to the header row to transform the column names when using the auto columnToKey mapping, maybe something like this:

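const result = excelToJson({
	sourceFile: 'SOME-EXCEL-FILE.xlsx',
	columnToKey: {
		'*': '{{columnHeader}}',
		// Proposed: functions applied to each column name individually
		transformSingleKey: [
			removeSpecialCharacters,
			replaceNumberColumnNames,
		],
		// Proposed: functions applied to the whole array of column names
		transformAllKeys: [
			suffixDuplicates,
		],
	},
});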

Here the transformSingleKey array is mapped over the column names, applying all of its functions to each name:

const cleanedUp = colnames.map(
  pipeFunctions(
    removeSpecialCharacters,
    replaceNumberColumnNames,
  ),
);

And the transformAllKeys array operates on the whole array of column names:

suffixDuplicates(cleanedUp)

Helper functions:

const suffixDuplicates = (list) => {
  // Number of times each name has been seen so far
  const count = new Map();
  return list.map((item) => {
    // A name that has not been seen yet yields undefined (not null), so default to 0
    const itemCount = (count.get(item) || 0) + 1;
    count.set(item, itemCount);
    // The first occurrence stays unchanged; suffixes start with _2
    return itemCount > 1 ? `${item}_${itemCount}` : item;
  });
};
const removeSpecialCharacters = key => key.replace(/[^a-zA-Z0-9]/g, '');

// Number.isNaN does not coerce strings, so purely numeric names are detected with a regex
const replaceNumberColumnNames = key => (/^\d+$/.test(key) ? `c${key}` : key);


// Builds a callback for Array.prototype.map that runs the value through each function in order
const pipeFunctions = (...funcsToApply) => (currentValue, index, arrayToMap) =>
  funcsToApply.reduce(
    (accum, cFunc) => cFunc(accum, index, arrayToMap),
    currentValue,
  );
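Put together, the two stages could run on the header row from the example sheet roughly like this. This is only a minimal sketch: `transformHeaders` is a hypothetical name, nothing here is part of the library, and the helpers are repeated so the snippet runs on its own.

```javascript
// Hypothetical sketch; none of these names exist in the library.
const removeSpecialCharacters = (key) => String(key).replace(/[^a-zA-Z0-9]/g, '');
const replaceNumberColumnNames = (key) => (/^\d+$/.test(key) ? `c${key}` : key);

const suffixDuplicates = (list) => {
  const count = new Map();
  return list.map((item) => {
    const n = (count.get(item) || 0) + 1;
    count.set(item, n);
    return n > 1 ? `${item}_${n}` : item;
  });
};

// Stage 1: run every single-key function on each header.
// Stage 2: run every all-keys function on the resulting array.
const transformHeaders = (headers, singleKeyFns, allKeysFns) => {
  const perKey = headers.map((h) => singleKeyFns.reduce((acc, fn) => fn(acc), h));
  return allKeysFns.reduce((acc, fn) => fn(acc), perKey);
};

const headers = transformHeaders(
  ['test', 'test', '1'],
  [removeSpecialCharacters, replaceNumberColumnNames],
  [suffixDuplicates],
);
console.log(headers); // [ 'test', 'test_2', 'c1' ]
```

Running the single-key stage first matters: two headers that only become identical after cleanup (say `a-b` and `ab`) are still caught by suffixDuplicates.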

Thank you


DiegoZoracKy · 2018-11-07T13:43:15Z

Hi @ArcticTee, thanks for bringing up this case, along with a solution that I believe is a very good one. It seems to match a suggestion that I left open a few months ago. Take a look at this issue and tell me what you think of this approach:

#4

ArcticTee · 2018-11-07T14:12:55Z

Yeah, this option is fine as long as you know the name of your columns.

I am currently working on a general import of Excel files, and I do not know the column format of those files when implementing it. In a GUI, the user of this import tool specifies the sheet name and whether the file contains a header row (and if so, where it is).

Therefore I would be happy if the enhancement for single keys and all keys for unknown columns were added 👍

DiegoZoracKy · 2018-11-07T14:44:28Z

I believe that both ideas can be joined. In fact, I can see two different issues here.

The first one is to never let data be discarded due to multiple column headers with the same name. By default it should suffixDuplicates as you suggested, and this behavior could be disabled via config.
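To make the first point concrete, here is a rough sketch of how default suffixing with an opt-out could behave. The option name `suffixDuplicateHeaders` and the function `rowsToObjects` are assumptions for illustration only; they are not an existing part of the library's interface.

```javascript
// Hypothetical sketch; `suffixDuplicateHeaders` is an assumed option name.
const suffixDuplicates = (list) => {
  const count = new Map();
  return list.map((item) => {
    const n = (count.get(item) || 0) + 1;
    count.set(item, n);
    return n > 1 ? `${item}_${n}` : item;
  });
};

// Turn a header row plus data rows into objects, suffixing duplicates by default.
const rowsToObjects = (headerRow, dataRows, { suffixDuplicateHeaders = true } = {}) => {
  const names = headerRow.map(String);
  const keys = suffixDuplicateHeaders ? suffixDuplicates(names) : names;
  return dataRows.map((row) => Object.fromEntries(row.map((cell, i) => [keys[i], cell])));
};

const header = ['test', 'test', '1'];
const data = [[1, 2, 2], [2, 3, 2]];

// Default: no column is discarded.
const withSuffix = rowsToObjects(header, data);
// -> [ { test: 1, test_2: 2, '1': 2 }, { test: 2, test_2: 3, '1': 2 } ]

// Opt-out: later duplicates overwrite earlier ones, as in the original report.
const withoutSuffix = rowsToObjects(header, data, { suffixDuplicateHeaders: false });
// -> [ { test: 2, '1': 2 }, { test: 3, '1': 2 } ]
```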

The second issue is about the interface to apply a pipeline of data transformations.

I believe that the first one is what will solve your current problem, am I right?

ArcticTee · 2018-11-07T14:56:28Z

> I believe that the first one is what will solve your current problem, am I right?

Only partly, because it is just one of the transformations our tool needs to do. We are using MongoDB, which accepts JSON input.

For example, suffixDuplicates will prevent data loss during the import, but it needs to look at all the keys to find duplicates.

Another practical problem the transformation pipeline would solve is invalid column names, which can be detected by looking at a single key. For example, replaceNumberColumnNames adds a 'c' prefix to every column whose name is just a number. For MongoDB this is necessary because it does not support column names that are just a number.

So for my use case it would be great if pipelines for single and all keys would be supported. The suffixDuplicates is just a function for our import tool, which this project can benefit from as well.

ArcticTee · 2018-11-20T10:38:37Z

@DiegoZoracKy Any updates on this? Let me know if I can help you.

DiegoZoracKy · 2018-11-21T15:46:07Z

I have almost no free time this month. Probably from next week onwards I'll be able to get back to this issue. The point is to define a good interface for us to configure the transformations.


ArcticTee · 2018-11-23T08:51:59Z

Yeah. I am happy you are on it.

tijevlam · 2022-08-22T07:32:57Z

Hi there, it has been a long time. Is there any news on this issue? In my case I would like to apply a camelCase function to the header row, but applying a function to {{columnHeader}} results in undefined keys.
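Until something like this is supported, one possible workaround is to rename the keys after the conversion. The sketch below works on a plain array of row objects and does not call the library; `toCamelCase` and `renameKeys` are hypothetical helpers, and the sample row is made up.

```javascript
// Hypothetical helpers that post-process plain row objects.
const toCamelCase = (header) =>
  String(header)
    .trim()
    .split(/[^a-zA-Z0-9]+/) // split on spaces, underscores, dashes, etc.
    .filter(Boolean)
    .map((word, i) => {
      const lower = word.toLowerCase();
      return i === 0 ? lower : lower[0].toUpperCase() + lower.slice(1);
    })
    .join('');

// Return new row objects whose keys have been passed through `fn`.
const renameKeys = (rows, fn) =>
  rows.map((row) =>
    Object.fromEntries(Object.entries(row).map(([key, value]) => [fn(key), value])));

// Made-up sample data standing in for one sheet of converted rows.
const rows = [{ 'First Name': 'Ada', last_name: 'Lovelace' }];
const renamed = renameKeys(rows, toCamelCase);
console.log(renamed); // [ { firstName: 'Ada', lastName: 'Lovelace' } ]
```

Note that this runs after the conversion, so columns already lost to duplicate headers are not recovered, and keys that collide after renaming overwrite each other.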
