-
Notifications
You must be signed in to change notification settings - Fork 16
/
dsv_tsv_csv.format.txt
65 lines (56 loc) · 4.53 KB
/
dsv_tsv_csv.format.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
DSV TSV CSV
ALTERNATIVES ==> #See serialization formats summary
STANDARDS ==> # - TSV: IANA MIME type
# - CSV: RFCs 4180, 7111
GOAL ==> #Simple way for an ARR|OBJ_ARR:
# - to persist it
# - to edit it with text editor
# - to parse it (good interoperability)
# - databases/spreadsheet/statistics programs can often open it
# - many Unix commands support delimiters: cut, paste, join, sort, uniq, awk
#No size limits
DSV ==> #"Delimiter-separated values"
#Encode an OBJ_ARR (if header) or ARR_ARR (if not header):
# - first dimension (records):
# - newline-separated
# - first record is header (optional)
# - blank records are ignored
# - second dimension (fields):
# - separated by any given character, e.g. , Tab Space | or :
# - escape delimiter/newline/" by encloding whole value with "..."
# - escape " by doubling it: ""
# - each record must have same number of fields
#Can be any charset
TSV ==> #"Tab-separated values"
#DSV using Tab as delimiter
#Extension: .tsv or .tab
#MIME type: text/tab-separated-values
#No escaping: tabs are disallowed in field values
CSV ==> #"Comma-separated values"
#DSV using , as delimiter
#Extension: .csv
#MIME type:
# - text/csv
# - parameters:
# - header "present|absent"
# - charset (def: ASCII)
#Newline is CR+LF (last one optional)
#Can use #FRAGMENT to select rows/cols/cells:
# - #row|col=NUMM[-NUMM2];...
# - #cell=ROW_NUMM,COL_NUMM[-ROW_NUMM2,COL_NUMM2];...
# - NUMM: NUM|*
# - NUM starts at 1
# - * means last
# - NUMM2 must be > NUMM
# - errors or non-existing ranges are ignored
VARIATIONS ==> #Implementations can sometimes be loose and not follow those standards.
#Possible variations:
# - delimiter
# - newline
# - mandatory|optional header
# - blank records: ignored or not
# - skipping last fields if empty
# - escaping
# - charset