Wednesday, October 31, 2012

Java 7: File Filtering using NIO.2 - Part 2

Hello all. This is Part 2 of the File Filtering using NIO.2 series. For those of you who haven't read Part 1, here's a recap.

NIO.2 is a new API for I/O operations included in the JDK since Java 7. With this API, you can perform the same operations as with java.io, plus useful extras such as accessing file metadata and watching for directory changes. The java.io package is not going to disappear, because of backward compatibility, but we are encouraged to start using NIO.2 for our new I/O requirements. In this post, we are going to see how easy it is to filter the contents of a directory using this API. There are three ways to do so; we already reviewed one in Part 1, and now we are going to see another approach.

What you need
NetBeans 7+ or any other IDE that supports Java 7

Filtering the content of a directory is a common task in some applications, and NIO.2 makes it really easy. The classes and interfaces we are going to use are described next:
  • java.nio.file.Path: Interface whose objects may represent files or directories in a file system. It's like java.io.File but in NIO.2. Whatever I/O operation you want to perform, you need an instance of this interface.
  • java.nio.file.PathMatcher: Interface that allows objects to perform match operations on paths.
  • java.nio.file.DirectoryStream: Interface whose objects iterate over the content of a directory.
  • java.nio.file.Files: Class with static methods that operate on files, directories, etc.
  • java.nio.file.FileSystems: Class with static factory methods for file systems; FileSystems.getDefault() returns the java.nio.file.FileSystem that creates PathMatcher objects.

The way we are going to filter the contents of a directory is by using objects that implement the java.nio.file.PathMatcher interface. We can get one of these objects from the java.nio.file.FileSystem class, using the method +getPathMatcher(String):PathMatcher; the default file system is available through FileSystems.getDefault(). This method supports both "glob" and "regex" patterns. You can check Part 1 of File Filtering using NIO.2 for more information about "glob", and for "regex" see the java.util.regex.Pattern class. A PathMatcher tests whatever Path you hand it; in this post we hand it the name of each file, directory, etc. that lives inside the directory. This is important to remember: with this approach you filter only by the name of the file, directory, etc.
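To make the "name only" point concrete, here is a minimal sketch. The class name PathMatcherBasics, its two helper methods, and the sample path Images/2_beach.jpg are made up for illustration; only getPathMatcher, matches, and getFileName come from the API.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original post.
final class PathMatcherBasics {

    /** Matches the pattern against the last element of the path only. */
    static boolean nameMatches(String syntaxAndPattern, Path path) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        Path name = path.getFileName(); // null when the path is a root
        return name != null && matcher.matches(name);
    }

    /** Matches the pattern against the path exactly as given. */
    static boolean pathMatches(String syntaxAndPattern, Path path) {
        return FileSystems.getDefault()
                          .getPathMatcher(syntaxAndPattern)
                          .matches(path);
    }

    public static void main(String[] args) {
        // No file is touched; the path is only used as a string to match.
        Path photo = Paths.get("Images", "2_beach.jpg");
        System.out.println(nameMatches("glob:*.jpg", photo));
        System.out.println(pathMatches("glob:*.jpg", photo));
    }
}
```

The second call does not match because a single * in glob syntax does not cross directory boundaries, which is why the scan method later in this post always passes path.getFileName() to the matcher.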

For example, if you want to filter .png and .jpg images, you should use one of the following syntax and pattern combinations (notice the colon between the syntax and the pattern):
  • "glob:*.{png,jpg}"
  • "regex:([^\s]+(\.(?i)(png|jpg))$)"
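Here is a small sketch of how those two strings could look inside Java source, where every backslash of the regex has to be doubled. The class ImagePatterns and the sample file names are assumptions for illustration only.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original post.
final class ImagePatterns {

    static final String GLOB = "glob:*.{png,jpg}";
    // Same regex as in the list above, with backslashes escaped for Java.
    static final String REGEX = "regex:([^\\s]+(\\.(?i)(png|jpg))$)";

    /** Applies a "syntax:pattern" string to a bare file name. */
    static boolean matches(String syntaxAndPattern, String fileName) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String[] samples = {"photo.png", "PHOTO.JPG", "my photo.png", "notes.txt"};
        for (String name : samples) {
            System.out.printf("%-14s glob=%b regex=%b%n",
                    name, matches(GLOB, name), matches(REGEX, name));
        }
    }
}
```

The two patterns are close but not identical: the regex rejects names that contain whitespace and ignores the case of the extension, while the glob accepts spaces and follows the case rules of the platform's file system provider.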

Of course, "glob" syntax is much simpler, but you have the option of using regular expressions for a more detailed match. Anyway, you may be wondering why you should use this approach if Files.newDirectoryStream(Path, String) already lets you filter a java.nio.file.DirectoryStream directly using "glob". Well, let's suppose that you already have a filter but you need to perform more than one filtering operation: that's when you need to use this approach.
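For comparison, this is roughly what the single-filter case looks like when the glob goes straight to Files.newDirectoryStream. It is only a sketch: the class GlobOnlyListing, the temporary directory, and the sample file names are invented for the demo.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of the original post.
final class GlobOnlyListing {

    /** Lists the names in dir that match a bare glob (no "glob:" prefix). */
    static List<String> list(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : ds) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names); // iteration order is not guaranteed
        return names;
    }

    /** Creates a throwaway directory with sample files and lists the images. */
    static List<String> demo() {
        String[] samples = {"a.png", "b.jpg", "c.txt"};
        try {
            Path dir = Files.createTempDirectory("nio2-glob-demo");
            for (String sample : samples) {
                Files.createFile(dir.resolve(sample));
            }
            List<String> images = list(dir, "*.{png,jpg}");
            // remove the sample files and the directory again
            for (String sample : samples) {
                Files.delete(dir.resolve(sample));
            }
            Files.delete(dir);
            return images;
        } catch (IOException ex) {
            throw new IllegalStateException("Demo could not run", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This is enough when one pattern decides everything. As soon as the matched entries need to be sorted into several groups, the extra PathMatcher objects come into play.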

The following piece of code defines a method which scans a directory using different patterns:

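import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

//in a class...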
    
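    /**
     * Scans the directory using the patterns passed 
     * as parameters. 
     * Only the first 3 patterns are used; any others are ignored.
     * @param folder directory to scan
     * @param patterns The first pattern is the bare glob 
     * (no "glob:" prefix) for the DirectoryStream. The optional 
     * second and third patterns are passed to getPathMatcher, 
     * so they need a "glob:" or "regex:" prefix.
     */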
    private static void scan(String folder, String... patterns) {
        //obtain a Path for the directory to scan
        Path dir = Paths.get(folder);
        //the Files class offers methods for validation
        if (!Files.exists(dir) || !Files.isDirectory(dir)) {
            System.out.println("No such directory!");
            return;
        }
        //validate at least the glob pattern
        if (patterns == null || patterns.length < 1) {
            System.out.println(
                "Please provide at least the glob pattern.");
            return;
        }

        //obtain the objects that implement PathMatcher
        PathMatcher extraFilterOne = null;
        PathMatcher extraFilterTwo = null;
        if (patterns.length > 1 && patterns[1] != null) {
            extraFilterOne = FileSystems.getDefault().
                                 getPathMatcher(patterns[1]);
        }
        if (patterns.length > 2 && patterns[2] != null) {
            extraFilterTwo = FileSystems.getDefault().
                                 getPathMatcher(patterns[2]);
        }

        //Try with resources... so nice!
        try (DirectoryStream<Path> ds = 
                  Files.newDirectoryStream(dir, patterns[0])) {
            //iterate over the content of the directory and apply 
            //any other extra pattern
            int count = 0;
            for (Path path : ds) {
                System.out.println(
                          "Evaluating " + path.getFileName());

                if (extraFilterOne != null && 
                    extraFilterOne.matches(path.getFileName())) {
                    System.out.println(
                                  "Match found Do something!");
                }

                if (extraFilterTwo != null && 
                    extraFilterTwo.matches(path.getFileName())) {
                    System.out.println(
                             "Match found Do something else!");
                }

                count++;
            }
            System.out.println();
            System.out.printf(
                 "%d Files match the glob pattern\n", count);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }

You can try invoking the last method with the following parameters:

  • C:\Images or /Images depending on your OS.
  • ?_*.jpg This pattern specifies that you want all .jpg images whose name starts with exactly one character (any character, not only a digit) followed by an underscore.
  • glob:2_* Specifies another filter (using glob syntax) where you only want items whose name starts with the number two followed by an underscore.
  • glob:3_* Specifies another filter (using glob syntax) where you only want items whose name starts with the number three followed by an underscore.

Having several filters allows you to take different actions for matched items.
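If you don't have an Images folder at hand, the same three patterns can be tried against plain names. This is only a sketch: the class FilterDemo, the labels it returns, and the sample names are hypothetical, and a "glob:" matcher stands in for the glob that scan passes to Files.newDirectoryStream.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original post.
final class FilterDemo {

    private static PathMatcher matcher(String syntaxAndPattern) {
        return FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
    }

    /** Says which of the example filters a file name would hit. */
    static String classify(String fileName) {
        Path name = Paths.get(fileName);
        // stand-in for the directory stream glob "?_*.jpg"
        if (!matcher("glob:?_*.jpg").matches(name)) {
            return "skipped";
        }
        if (matcher("glob:2_*").matches(name)) {
            return "group 2";
        }
        if (matcher("glob:3_*").matches(name)) {
            return "group 3";
        }
        return "no extra match";
    }

    public static void main(String[] args) {
        String[] samples = {"2_beach.jpg", "3_city.jpg", "a_notes.jpg",
                            "22_lake.jpg", "2_beach.png"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```

Note that a_notes.jpg passes the first pattern even though it does not start with a digit, because ? accepts any single character.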
Following is the result of the execution on my Windows machine:



And on my Linux virtual machine:



Again, write once, run everywhere! However, notice that the ordering of the items is system dependent, so never hardcode the position of a file or directory.
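If your code does need a predictable order, one option is to collect the names and sort them yourself. The sketch below assumes a hypothetical class SortedEntries; a DirectoryStream<Path> can be passed in directly because it is an Iterable<Path>.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of the original post.
final class SortedEntries {

    /** Collects the file names of the entries and sorts them as strings. */
    static List<String> sortedNames(Iterable<Path> entries) {
        List<String> names = new ArrayList<>();
        for (Path entry : entries) {
            Path name = entry.getFileName(); // null when the path is a root
            if (name != null) {
                names.add(name.toString());
            }
        }
        Collections.sort(names); // natural String order, same on every OS
        return names;
    }

    public static void main(String[] args) {
        List<Path> unordered = Arrays.asList(
                Paths.get("3_city.jpg"), Paths.get("2_beach.jpg"));
        System.out.println(sortedNames(unordered));
    }
}
```

Sorting plain strings keeps the result identical across platforms, whereas comparing Path objects follows the rules of each file system provider.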

I hope you enjoyed this post. There is another, more powerful way to filter the content of a directory, and we'll explore it in Part 3.


See ya!
