script · python

Script to partition data into groups of NUMBER elements and insert them into a string with a delimiter

Problem

Sometimes you need to split your data into groups of a specific number of lines and paste each group into another string, joined by a delimiter.

Solution

Script partition.py
Partitions STDIN into groups of NUMBER lines and puts each group into the given string, joined by a delimiter.

Usage

python3 partition.py NUMBER REPLACE_STRING DELIMITER STRING_TO_INSERT_DATA  

For example

We have a file with database identifiers and would like to update rows in a table by these ids. But there are lots of ids, so we would like to split them into batches (of 100, say; the example below uses 2 to keep the output short).

cat merge.csv | python3 partition.py 2 __id__ '","' 'UPDATE wm2.catalog_good SET yml_id=7841 WHERE yml_id=2816 AND own_id IN ("__id__"); COMMIT;' > wm2.catalog_good.sql  
Where

merge.csv - the file with the identifiers, one per line:

1  
2  
3  
45  
33  

2 - partition size (how many ids should go into one UPDATE)

__id__ - string to replace in STRING_TO_INSERT_DATA

'","' - delimiter

'UPDATE wm2.catalog_good SET yml_id=7841 WHERE yml_id=2816 AND own_id IN ("__id__"); COMMIT;' - the string into which the ids from merge.csv are inserted at the place marked __id__, 2 at a time, joined with the delimiter '","'

Result

A file with content like:

UPDATE wm2.catalog_good SET yml_id=7841 WHERE yml_id=2816 AND own_id IN ("1","2"); COMMIT;  
UPDATE wm2.catalog_good SET yml_id=7841 WHERE yml_id=2816 AND own_id IN ("3","45"); COMMIT;  
UPDATE wm2.catalog_good SET yml_id=7841 WHERE yml_id=2816 AND own_id IN ("33"); COMMIT;  
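
To see how the placeholder and the delimiter work together, the example can be replayed in a few lines of Python. This is only a sketch, not the script itself: `render` is a hypothetical helper name, and the ids are hard-coded here instead of being read from STDIN.

```python
# Hypothetical helper for illustration; not part of partition.py.
def render(ids, size, placeholder, delimiter, template):
    """Yield one filled-in template per group of at most `size` ids."""
    for start in range(0, len(ids), size):
        group = ids[start:start + size]
        # Join the group with the delimiter, then drop it in place of the placeholder.
        yield template.replace(placeholder, delimiter.join(group))

ids = ["1", "2", "3", "45", "33"]  # the sample content of merge.csv
template = ('UPDATE wm2.catalog_good SET yml_id=7841 '
            'WHERE yml_id=2816 AND own_id IN ("__id__"); COMMIT;')

statements = list(render(ids, 2, "__id__", '","', template))
for statement in statements:
    print(statement)
```

The quotes around `__id__` in the template supply the opening quote of the first id and the closing quote of the last one, while the delimiter `","` closes and reopens the quotes between neighbouring ids.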

Source

#!/usr/bin/python3

import sys

def process(strings, expression, replacestring, separator):  
    return expression.replace(replacestring, separator.join(strings))

if __name__ == "__main__":  
    if len(sys.argv) < 5:
        print(sys.argv[0] + " LIMIT REPLACESTRING SEPARATORSTRING EXPRESSION")
        print("  e.g.: " + sys.argv[0] + " 10 _replace_ , '{\"names\": [_replace_]}'")
        sys.exit(1)  # stop here instead of failing on the missing arguments below

    limit = int(sys.argv[1])
    replacestring = sys.argv[2]
    separator = sys.argv[3]
    expression = sys.argv[4]

    buffer = []

    for rawline in sys.stdin:
        line = rawline.strip()
        if len(line) == 0: continue
        buffer.append(line)
        if len(buffer) >= limit:
            print(process(buffer, expression, replacestring, separator))
            buffer.clear()

    if len(buffer) > 0:
        print(process(buffer, expression, replacestring, separator))
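
The buffering loop could also be written as a generator, which makes the grouping step easy to try out without piping anything through STDIN. The following is a sketch of one possible variant, not the script's actual code: the function name `partition` is made up for illustration, and it assumes `limit` is a positive integer.

```python
import io
from itertools import islice

# Hypothetical alternative to the buffer loop; assumes limit >= 1.
def partition(lines, limit):
    """Yield lists of up to `limit` stripped, non-empty lines."""
    cleaned = (line.strip() for line in lines)
    nonempty = (line for line in cleaned if line)  # skip blank lines, as the script does
    while True:
        chunk = list(islice(nonempty, limit))  # take the next `limit` lines, if any
        if not chunk:
            return
        yield chunk

# io.StringIO stands in for sys.stdin here.
sample = io.StringIO("1\n2\n\n3\n45\n33\n")
chunks = list(partition(sample, 2))
print(chunks)
```

Each chunk can then be passed to `process` exactly as the buffer is in the script. Because the lines are consumed lazily, at most `limit` lines are held in memory at a time, the same as with the explicit buffer.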

You can always download all my scripts from my GitHub repo.

Special thanks to Artem Shitov.
