[UPHPU] Help deduping, dedupe 2 text lists.
Orson Jones
orson.uphpu at afriskito.net
Tue Oct 28 16:50:13 MDT 2008
thebigdog wrote:
> Ash wrote:
>> I have a text file (list 1) that has 10,000 records in it (addresses)
>> and another that has 15,000 records (list 2). List 2 contains all the
>> records that List 1 does. I need to get the 5,000 records that are in
>> list 2 but not in list 1.
>>
>> I tried comm -3 list\ 1 list\ 2
>>
>> But it gave me back 25000 records. Comm -1 and comm -2 give 15000 and
>> 10000 respectively.
>>
>> Any other way to do it?
>
> diff the 2 files. just make sure that you sort each file properly.
yep, sort each file, then run diff or uniq.
sort file1.txt file2.txt | uniq -u > output.txt
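To see what the one-liner above does, here is a minimal sketch on a tiny sample. The addresses and the temporary directory are made up for illustration; only the sort | uniq -u pipeline comes from the message itself.

```shell
# Hypothetical sample data standing in for the two address lists.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '12 Oak St' '7 Elm Ave' > file1.txt
printf '%s\n' '7 Elm Ave' '99 Pine Rd' '12 Oak St' '3 Birch Ln' > file2.txt

# Sorting both files together puts records that are in both lists next to
# each other; uniq -u then keeps only the lines that occur exactly once.
sort file1.txt file2.txt | uniq -u > output.txt

cat output.txt
```

This relies on the situation described in the question: every record in list 1 is also in list 2, and no record is repeated within list 2. A record that appears twice in list 2 alone would be dropped, and a record found only in list 1 would show up in the output.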
I like the uniq method a bit better for this situation, but diff works if you know the correct options.
sort file1.txt > file1_sorted.txt
sort file2.txt > file2_sorted.txt
diff --funky-options-to-make-it-work-correctly file1_sorted.txt file2_sorted.txt > output.txt
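For the sorted-file route, here is a hedged sketch of two ways it could be finished. Neither is necessarily the set of diff options meant above. One filters plain diff output, the other uses comm -13 on the sorted files. comm expects sorted input, which may be why the earlier comm attempt reported so many lines. The sample addresses, the temporary directory and the output_comm.txt / output_diff.txt names are made up for illustration.

```shell
# Hypothetical sample data standing in for the two address lists.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '12 Oak St' '7 Elm Ave' > file1.txt
printf '%s\n' '7 Elm Ave' '99 Pine Rd' '12 Oak St' '3 Birch Ln' > file2.txt

# Use one collation order so sort and comm agree on what "sorted" means.
export LC_ALL=C
sort file1.txt > file1_sorted.txt
sort file2.txt > file2_sorted.txt

# comm -13 hides column 1 (lines only in file 1) and column 3 (lines in
# both), leaving the lines found only in file 2.
comm -13 file1_sorted.txt file2_sorted.txt > output_comm.txt

# Plain diff marks lines present only in the second file with "> ";
# sed keeps just those lines and strips the marker.
diff file1_sorted.txt file2_sorted.txt | sed -n 's/^> //p' > output_diff.txt

cat output_comm.txt
```

Unlike the uniq -u approach, both of these only report lines from list 2, so a stray record that exists only in list 1 would not end up in the result.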
Orson