Genomics, Evolution and Medicine
This program is useful if you have a directory of several hundred files and you want to identify the one or more that contain a sequence of interest.
I originally wrote it to quickly query protein families and identify which files contain a sequence of interest. At present it is written to search all files ending in '.prot' for a query of interest. You can change this to *.txt, *.fa or whatever file ending is suitable for your question.
To run it, paste the text below into a text editor and save it as FindFile.py. If you want to find which file ending in ".prot" contains the sequence name 'ENSG00000012048', then you would type
> python FindFile.py ENSG00000012048
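import glob
import sys

#The query is the first argument given on the command line
Query = sys.argv[1]
#Open all files ending in .prot
OpenFiles = glob.glob('*.prot')
for fileName in OpenFiles:
    for line in open(fileName):
        #if your query is in any of these files
        if Query in line:
            #print the query and the fileName
            print("%s %s" % (Query, fileName))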
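If you want to change the file ending without editing the script, or you only want each file reported once even when the query appears on several lines, the same search can be wrapped in a function. This is only a rough sketch of that idea: the name find_files and its pattern argument are my own additions for illustration and are not part of the script above.

```python
import glob


def find_files(query, pattern="*.prot"):
    """Return the names of files matching pattern that contain query."""
    matches = []
    # sorted() only makes the order of the results predictable
    for file_name in sorted(glob.glob(pattern)):
        with open(file_name) as handle:
            # any() stops reading the file at the first line containing the query
            if any(query in line for line in handle):
                matches.append(file_name)
    return matches
```

For example, find_files('ENSG00000012048', '*.fa') would return a list of the .fa files in the current directory that mention that sequence name, with each file listed once.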