[UPHPU] Help deduping, dedupe 2 text lists.

Ash ashovi at qwest.net
Wed Oct 29 09:34:20 MDT 2008


And we have a winner. Thanks. uniq is what I was missing.

Ash

Orson Jones wrote:
> thebigdog wrote:
>> Ash wrote:
>>> I have a text file (list 1) that has 10,000 records in it (addresses)
>>> and another that has 15,000 records (list 2). List 2 contains all the
>>> records that List 1 does. I need to get the 5,000 records that are in
>>> list 2 but not in list 1.
>>>
>>> I tried comm -3 list\ 1 list\ 2
>>>
>>> But it gave me back 25000 records. comm -1 and comm -2 give 15000 and
>>> 10000 respectively.
>>>
>>> Any other way to do it?
>> Diff the 2 files. Just make sure that you sort each file properly.
> 
> yep, sort each file, then run diff or uniq.
> 
> sort file1.txt file2.txt | uniq -u > output.txt
> 
> I like the uniq method a bit better for this situation, but diff works if you know the correct options.
> 
> sort file1.txt > file1_sorted.txt
> sort file2.txt > file2_sorted.txt
> diff --funky-options-to-make-it-work-correctly file1_sorted.txt file2_sorted.txt > output.txt
> 
> Orson
> 
> _______________________________________________
> 
> UPHPU mailing list
> UPHPU at uphpu.org
> http://uphpu.org/mailman/listinfo/uphpu
> IRC: #uphpu on irc.freenode.net
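
A note on the comm attempt quoted above: comm expects both inputs to be sorted, and -3 on its own only hides the lines common to both files, so it still prints the lines unique to each side. The counts reported in the thread (25000, 15000, 10000) look consistent with comm finding no matching lines at all, though that is only a guess from the numbers. Below is a minimal sketch of the sorted comm -13 variant next to the sort | uniq -u approach from the thread. The addresses and file names are made-up sample data, not the original lists.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the two address lists.
workdir=$(mktemp -d)
printf '%s\n' '12 Oak St' '7 Elm Ave' '3 Pine Rd' > "$workdir/list 1"
printf '%s\n' '3 Pine Rd' '99 Birch Ln' '12 Oak St' '7 Elm Ave' '41 Cedar Ct' > "$workdir/list 2"

# comm needs both inputs sorted in the same collation order.
LC_ALL=C sort "$workdir/list 1" > "$workdir/list1_sorted.txt"
LC_ALL=C sort "$workdir/list 2" > "$workdir/list2_sorted.txt"

# -1 hides lines found only in the first file, -3 hides lines common to both,
# so only the lines unique to the second file are left.
only_in_list2=$(LC_ALL=C comm -13 "$workdir/list1_sorted.txt" "$workdir/list2_sorted.txt")
printf '%s\n' "$only_in_list2"

# The sort | uniq -u approach from the thread. It gives the same answer only if
# neither list has internal duplicates and list 1 is fully contained in list 2.
uniq_result=$(LC_ALL=C sort "$workdir/list 1" "$workdir/list 2" | uniq -u)
printf '%s\n' "$uniq_result"

# Remove the temporary sample files.
rm -rf "$workdir"
```

If list 1 had records that were missing from list 2, uniq -u would report those as well, while comm -13 would still print only the records unique to list 2.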