How to List All Files in Directory

Python developers often need to get a list of all files in a directory, for example when an application or website has to display the contents of a folder. There are several simple ways to do this in Python.

The Problem

You may be building an application or website that needs to display a list of the files in a directory on a specific web page. For example, it may feature an image gallery showing all uploaded images. Alternatively, you may need to iterate through all files in a folder and perform a task on each of them, such as processing every text file. In all these cases, you need to list all files in a directory.

How to List All Files in Directory

Here are the different ways to list all files in a directory.

1. Using listdir() function

Python's os module provides many functions for working with the operating system and its files. Among them, listdir() is a popular function that returns a list of the names of both files and subdirectories in a directory. Here is its syntax.

os.listdir(path)

In the above command, you need to provide the directory path for which you want to list files and subdirectories. It can be a relative or absolute path as per your requirement.

Here is an example to get a list of files and subdirectories in directory ‘/home/ubuntu’.

import os
print(os.listdir('/home/ubuntu'))

Here is the output.

['manuel.txt', '.qws', '.gnupg', ..., '.bashrc', '.wget-hsts', '.npm', '.viminfo', '.local']

Here is the command to view names of all files & directories in current directory.

os.listdir('.')

You can also use this command to list files & directories in parent directory.

os.listdir('..')
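
Since listdir() returns bare names in no guaranteed order, it often helps to sort them and join each one with the directory path. The following is a minimal sketch of that idea. The helper name list_entries and the temporary directory are assumptions made for illustration, so that the example runs on any machine.

```python
import os
import tempfile

def list_entries(directory):
    """Return sorted full paths of every entry (files and subdirectories)."""
    # os.listdir() returns names only, in arbitrary order, so sort them
    # and join each one with the directory to get a usable path.
    return [os.path.join(directory, name) for name in sorted(os.listdir(directory))]

# Demonstration on a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'b.txt'), 'w').close()
    open(os.path.join(tmp, 'a.txt'), 'w').close()
    os.mkdir(os.path.join(tmp, 'sub'))
    entries = list_entries(tmp)
    names = [os.path.basename(p) for p in entries]
    print(names)  # ['a.txt', 'b.txt', 'sub']
```

The result still mixes files and subdirectories, exactly as listdir() does.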

2. Using walk() function

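The os module also provides a walk() function that takes the path of the directory to be explored and returns a generator. For each directory it visits, starting with the one you pass in, the generator yields a tuple of three items: the directory path, a list of its subdirectory names and a list of its file names. We need to loop through the generator to extract these names. Here is an example where we call os.walk() on the /home/ubuntu directory. We take the file names from the first tuple, which describes /home/ubuntu itself, store them in another list and then stop the loop so that subdirectories are not visited.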

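from os import walk

f = []
my_path = '/home/ubuntu'
# each iteration yields (dirpath, dirnames, filenames) for one directory
for (dirpath, dirnames, filenames) in walk(my_path):
    f.extend(filenames)
    # stop after the top-level directory; remove this to include subdirectories
    break

print(f)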

Here is a sample output.

['LICENSE.txt', 'MySQL-python-wininst.log', ... , 'README.txt', 'RemoveMySQL-python.exe', 'w9xpopen.exe']

Since os.walk() returns a generator instead of a complete list, it yields one directory at a time and does not have to hold the whole directory tree in memory. So it is suitable if your directory contains a large number of files and subdirectories. Also, since we loop through the generator, we can selectively list files and subdirectories instead of extracting all of them.

Here is a shorter way to write the same code, if you only want to extract file names.

filenames = next(walk(my_path), (None, None, []))[2]
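
The examples above stop at the top-level directory. If you also want the files inside nested subdirectories, one possible approach is to let the loop run to the end and join each file name with the dirpath it was found in. This is only a sketch: the function name list_files_recursive and the sample folder layout are hypothetical.

```python
import os
import tempfile

def list_files_recursive(top):
    """Return paths of all files under top, including nested subdirectories."""
    found = []
    # os.walk() yields one (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)

# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'docs', 'old'))
    for rel in ('top.txt',
                os.path.join('docs', 'a.txt'),
                os.path.join('docs', 'old', 'b.txt')):
        open(os.path.join(tmp, rel), 'w').close()
    relative = [os.path.relpath(p, tmp) for p in list_files_recursive(tmp)]
    print(relative)
```

Without the break statement, every directory below the starting path is visited once.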

3. Using glob.glob() function

Python also provides the glob module, which lets you list files and directories by pattern. Its glob() function not only lists files and directories but also allows you to do pattern matching. Here is its syntax.

glob.glob(path_with_pattern)

In the above command, you need to provide the path to the directory where you want glob() to work, along with a pattern to match its files and subdirectories.

Please note, a wildcard pattern such as * lists all files and directories except the ones whose names begin with a dot (.), since glob treats them as hidden.

Here is an example to list all files and directories in /home/ubuntu.

import glob
print(glob.glob('/home/ubuntu/*'))

Here is a sample output.

['/home/ubuntu/remove-old-snaps', '/home/ubuntu/gpush.sh', ..., '/home/ubuntu/tweet.py', '/home/ubuntu/t']

Please note, you need to provide a wildcard character or other pattern to match the files and subdirectories in the target folder. Otherwise, glob() will return only the input directory itself. The following command will not give the desired result.

print(glob.glob('/home/ubuntu/'))

If you want to list only the .txt files in the target directory, then include the extension in your pattern.

print(glob.glob('/home/ubuntu/*.txt'))

If you want to list all .txt files in current directory, you can simply mention the pattern matching for file extension.

print(glob.glob('*.txt'))
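
The patterns above match a single directory only. glob.glob() also accepts a recursive=True argument, in which case the special pattern ** matches zero or more levels of subdirectories. Here is a small sketch of how that might be used. The helper name find_by_pattern and the sample files are assumptions for illustration.

```python
import glob
import os
import tempfile

def find_by_pattern(top, pattern):
    """Return sorted paths under top (at any depth) whose names match pattern."""
    # '**' only spans subdirectories when recursive=True is passed.
    search = os.path.join(top, '**', pattern)
    return sorted(glob.glob(search, recursive=True))

# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'notes'))
    for rel in ('a.txt', 'b.pdf', '.hidden.txt', os.path.join('notes', 'c.txt')):
        open(os.path.join(tmp, rel), 'w').close()
    txt_files = [os.path.relpath(p, tmp) for p in find_by_pattern(tmp, '*.txt')]
    print(txt_files)
```

Note that .hidden.txt is not returned, because the * wildcard does not match a leading dot.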

4. List Only Files

We have seen that os.listdir(), os.walk() and glob.glob() are the most popular ways to list files and directories in Python. Now let us look at some common use cases. Often Python developers need to list only files. You can do this using listdir() and isfile() functions as shown below.

from os import listdir
from os.path import isfile, join
my_path = '/home/ubuntu'
files = [f for f in listdir(my_path) if isfile(join(my_path, f))]

In the above code, we first import the listdir function from the os module. We also import the isfile and join functions from os.path. We use listdir() to get the names of files and directories together, and a list comprehension to loop through them. In each iteration, we use join() to build the full path of the entry and call isfile() to check whether that path is a file. Each name for which isfile() returns true is stored in the files list. In the end, this list contains only file names, without the directory path.

If you are using os.walk() you can use the following short code for this purpose.

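from os import walk
filenames = next(walk(my_path), (None, None, []))[2]

os.walk() returns a generator that yields a tuple for each directory. Here next() fetches only the first tuple, which describes my_path itself, and index 2 picks its list of file names. If my_path does not exist, the generator yields nothing, so the default tuple is used and the result is an empty list.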
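
As an alternative that this article does not otherwise cover, the os module also offers scandir(), whose entries carry an is_file() method, so no separate join() call is needed. The sketch below assumes a helper name, list_only_files, chosen purely for illustration.

```python
import os
import tempfile

def list_only_files(directory):
    """Return sorted names of regular files in directory, skipping subdirectories."""
    # os.scandir() yields DirEntry objects that already know their type.
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())

# Demonstration on a throwaway directory (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'report.txt'), 'w').close()
    os.mkdir(os.path.join(tmp, 'images'))
    only_files = list_only_files(tmp)
    print(only_files)  # ['report.txt']
```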

5. List Specific File Types

If you want to extract specific file types, then using glob.glob() function is your best bet. It allows you to easily specify a wide range of patterns to match file types. Here is an example to list all pdf files in /home/ubuntu directory.

glob.glob('/home/ubuntu/*.pdf')

In the above code, /home/ubuntu/*.pdf stands for pattern to match all PDF files in /home/ubuntu. Here is a command to list all files starting with letter ‘a’ in /home/ubuntu.

from glob import glob
from os.path import isfile
my_path = '/home/ubuntu/a*'
files = [f for f in glob(my_path) if isfile(f)]

In the above code, we import the glob() function, along with isfile from os.path. Our pattern is ‘/home/ubuntu/a*’, meaning all files and directories whose names start with ‘a’. We use a list comprehension to loop through the paths returned by glob(). Since each result already includes the directory part of the pattern, we can pass it directly to isfile() to check whether it is a file. The paths that return true are added to the list named files.
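
If you need more than one file type at once, one simple option is to run a glob pattern per extension and combine the results. The following sketch assumes a helper called list_by_extensions and some made-up image files. It is one way to do it, not the only one.

```python
import glob
import os
import tempfile

def list_by_extensions(directory, extensions):
    """Return sorted paths of files in directory ending with any given extension."""
    matches = []
    for ext in extensions:
        pattern = os.path.join(directory, '*' + ext)
        # isfile() filters out directories that happen to match the pattern.
        matches.extend(p for p in glob.glob(pattern) if os.path.isfile(p))
    return sorted(matches)

# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('photo.jpg', 'scan.png', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    os.mkdir(os.path.join(tmp, 'folder.png'))  # a directory, not a file
    images = [os.path.basename(p) for p in list_by_extensions(tmp, ['.jpg', '.png'])]
    print(images)  # ['photo.jpg', 'scan.png']
```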

6. List Subdirectories

If you want to list only subdirectories instead of files, you can slightly modify the above solution. Here is an example using the listdir() function.

from os import listdir
from os.path import isdir, join
my_path = '/home/ubuntu'
dirs = [f for f in listdir(my_path) if isdir(join(my_path, f))]

In the above code, we import the listdir function from the os module, and the isdir and join functions from the os.path module. listdir() lists the names of all files and folders in the given directory. We loop through this list and in each iteration call join() to get the full path to the file or directory. We then call isdir() on that path. If it returns true, we store the name in the dirs list.

You can similarly use the glob.glob() function for this purpose. It provides greater flexibility, since you can specify wildcard characters in the folder path. Here is an example to list subdirectories starting with the letter ‘a’ in /home/ubuntu/.

from glob import glob
from os.path import isdir
# the wildcard is needed; without it glob() returns only /home/ubuntu/ itself
my_path = '/home/ubuntu/a*'
dirs = [f for f in glob(my_path) if isdir(f)]
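
The subdirectory filter can also be wrapped in a small reusable function. The sketch below uses os.path.isdir() as the test. The function name list_subdirectories and the sample entries are hypothetical.

```python
import os
import tempfile

def list_subdirectories(directory):
    """Return sorted names of the immediate subdirectories of directory."""
    return sorted(
        name for name in os.listdir(directory)
        # join() is needed because listdir() returns names, not paths.
        if os.path.isdir(os.path.join(directory, name))
    )

# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'alpha'))
    os.mkdir(os.path.join(tmp, 'beta'))
    open(os.path.join(tmp, 'a.txt'), 'w').close()
    subdirs = list_subdirectories(tmp)
    print(subdirs)  # ['alpha', 'beta']
```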

7. List Parent Folders

You can use the listdir() function with '..' to list the files and directories in the parent of the current working directory.

os.listdir('..')
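
The '..' shortcut is relative to the current working directory. If you instead want the parent of some other folder, one possible approach is to compute it with os.path.dirname() on the absolute path. In this sketch, the helper name list_parent and the folder names are assumptions for illustration.

```python
import os
import tempfile

def list_parent(directory):
    """Return (parent_path, sorted names of the entries in that parent)."""
    # abspath() normalises the path, so dirname() yields the real parent.
    parent = os.path.dirname(os.path.abspath(directory))
    return parent, sorted(os.listdir(parent))

# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    child = os.path.join(tmp, 'child')
    os.mkdir(child)
    open(os.path.join(tmp, 'sibling.txt'), 'w').close()
    parent_path, parent_entries = list_parent(child)
    print(parent_entries)  # ['child', 'sibling.txt']
```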

Conclusion

In this article, we have learnt several different ways to list files and directories in Python. You can use any of these methods as per your requirement. If your folder contains a very large number of files and directories, use os.walk(), since it returns a generator that yields one directory at a time instead of building a complete list in memory. If you need flexibility in specifying folder paths, or want to extract files of a specific type, use the glob.glob() function.
