Awk using input from pipe

awk grep

I have a (very) basic understanding of AWK. I have tried a few ways of doing this, but they all print far more lines than I want.

I have 10 lines in file.1:

chr10   234567
chr20   123456
...
chrX    62312

I want to convert these to uppercase and match them against the first 2 columns of file.2, so that the first line below matches the second line above. I don't want to get the second line below, which matches the third line above for position but not chr, and I don't want the first line below to match the first line above.

CHR20   123456    ...   234567 
CHR28   234567    ...   62312

I have:

$ cat file.1 | tr '[:lower:]' '[:upper:]' | <grep? awk?>

and would love to know how to proceed. I previously used a simple grep, but the second column of file.1 matches in more places in the searched file, so I get hundreds of lines returned. I want to match on just the first 2 columns (they correspond to the first 2 columns in file.2).

Hope that's clear enough; I look forward to your answers =)

Best Answer

If the files are sorted by the first column you can do:

join -i file.1 file.2 | awk '$3==$2{ $3=""; print}'

If they're not sorted, sort them first.

The -i flag says to ignore case.
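
As a rough sketch of the sort-then-join step: the sample rows below are made up to resemble the question (the foo/bar columns and the .sorted file names are placeholders, not from the original post). It assumes whitespace-separated fields, and it upper-cases file.1 up front so that both files can be sorted and joined in the C locale without relying on -i.

```shell
# Hypothetical sample data modelled on the question.
printf 'chrX\t62312\nchr20\t123456\nchr10\t234567\n' > file.1
printf 'CHR28\t234567\tbar\t62312\nCHR20\t123456\tfoo\t234567\n' > file.2

# Use one fixed collation order for sort and join.
LC_ALL=C
export LC_ALL

# Upper-case file.1, then sort both files on the first column.
tr '[:lower:]' '[:upper:]' < file.1 | sort -k1,1 > file.1.sorted
sort -k1,1 file.2 > file.2.sorted

# After join, $2 is the position from file.1 and $3 the position from file.2;
# keep only lines where they agree, and blank the duplicated column.
joined=$(join file.1.sorted file.2.sorted | awk '$3==$2{ $3=""; print}')
printf '%s\n' "$joined"
```

On this sample data only the CHR20 line should survive, because CHR28 has no partner in file.1.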

That won't work if there are multiple lines with the same value in the first column. To make that work you would need something more complicated.
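
One possible alternative, offered as a sketch rather than a tested solution for the real data, is a two-pass awk lookup keyed on both columns. It needs no sorting, and repeated first-column values are not a problem because each (chr, position) pair is stored separately. The sample rows (including the foo/bar columns) are invented for illustration, and the NR==FNR idiom assumes file.1 is not empty.

```shell
# Hypothetical sample data modelled on the question.
printf 'chr10\t234567\nchr20\t123456\nchrX\t62312\n' > file.1
printf 'CHR20\t123456\tfoo\t234567\nCHR28\t234567\tbar\t62312\n' > file.2

# Pass 1 (NR==FNR, i.e. while reading file.1): remember every
#   upper-cased chr / position pair.
# Pass 2 (file.2): print lines whose first two columns form a stored pair.
result=$(awk '
    NR==FNR { seen[toupper($1), $2]; next }
    (toupper($1), $2) in seen
' file.1 file.2)
printf '%s\n' "$result"
```

Because only columns 1 and 2 of file.2 are compared, a position that happens to appear in a later column (such as 234567 at the end of the CHR20 line) does not cause a match on its own.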