Sometimes, you don't have the complete name of the file that you intend to read or write in your transformation. That can be because the name of the file depends on a field or on external information. Suppose you receive a text file with information about new books to process. This file is sent to you on a daily basis and the date is part of its name (for example, newBooks_20100927.txt).
In order to follow this recipe, you must have a text file named newBooks_20100927.txt with sample book information such as the following:
"Title","Author","Price","Genre"
"The Da Vinci Code","Dan Brown","25.00","Fiction"
"Breaking Dawn","Stephenie Meyer","21.00","Children"
"Foundation","Isaac Asimov","38.50","Fiction"
"I, Robot","Isaac Asimov","39.99","Fiction"
Carry out the following steps:
1. Add a Get System Info step. Add a field named today, and in the Type listbox, select System date (variable).
2. Convert the today field to a String with the format yyyyMMdd.
In the recipe, the file is saved in the same directory as the transformation. To use this directory, you have to get it as a field in your dataset. That's the purpose of the next step.
3. Add a field named path. In the Variable column, press Ctrl+Space to show the list of possible variables, and select Internal.Transformation.Filename.Directory.
4. With a Java Expression, add a field named filename (type it in the New field column), and type path + "/newBooks_" + today + ".txt" in the Java Expression column. Previewing this step, you will obtain the complete path for the file, for example, file:///C:/myDocuments/newBooks_20100927.txt.
5. Add a Text File Input step. Under the File tab, in the Accept filenames from previous step section, select the step that creates the field, and use filename as the name of the field.
6. Type , in the Separator, and set the header to 1 line.
7. Define the following fields: Title (String), Author (String), Price (Number), Genre (String).
You can't use the Get Fields button in this case because the name of the file will be set dynamically. To obtain the headers automatically, you can fill the File tab with the name of a sample file. Then, when you click the Get Fields button, the grid will be populated. Finally, you must remove the sample file from the File tab and set the Accept filenames from previous step section again.
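Outside PDI, the filename logic built in the steps above can be sketched in plain Java. This is only a minimal illustration, not PDI code: the class name DynamicFileName and the method build are hypothetical, and the directory is passed in as a plain string rather than read from Internal.Transformation.Filename.Directory.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of PDI: mirrors the expression
// path + "/newBooks_" + today + ".txt" used in the recipe.
class DynamicFileName {
    // Same pattern as the format applied to the today field.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyyMMdd");

    static String build(String path, LocalDate date) {
        String today = date.format(FORMAT);
        return path + "/newBooks_" + today + ".txt";
    }

    public static void main(String[] args) {
        // The directory is an assumption taken from the recipe's example path.
        System.out.println(build("C:/myDocuments", LocalDate.now()));
    }
}
```

Keeping the date pattern in one place makes it easy to adapt the sketch if the sender changes the naming convention of the daily file.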
When you have to read a file and the filename is known only at the moment you run the transformation, you cannot set the filename explicitly in the grid located under the File tab of the Input step. However, there is a way to provide the name of the file.
First, you have to create a field with the name of the file including its complete path.
Once you have that field, the only thing to do is to configure the Accept filenames from previous step section of the Input step specifying the step from which that field comes and the name of the field.
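As a rough analogy in plain Java, the idea can be sketched as follows, assuming that each incoming row carries one filename. The class FileNameDrivenReader and its method readAll are hypothetical names used for illustration; they are not PDI APIs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical analogy, not PDI code: the "previous step" supplies filenames,
// and the "input step" opens whatever names it receives.
class FileNameDrivenReader {
    // Reads every file named in filenames, skipping headerLines lines in each.
    static List<String> readAll(List<String> filenames, int headerLines) throws IOException {
        List<String> rows = new ArrayList<>();
        for (String filename : filenames) {
            List<String> lines = Files.readAllLines(Paths.get(filename), StandardCharsets.UTF_8);
            int start = Math.min(headerLines, lines.size());
            rows.addAll(lines.subList(start, lines.size()));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Filenames are only known at run time, here as command-line arguments.
        for (String row : readAll(Arrays.asList(args), 1)) {
            System.out.println(row);
        }
    }
}
```

The reader never contains a hard-coded filename, which is the same separation the Input step achieves when it takes the name from a field.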
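In the recipe, you didn't know the complete name because part of the name was the system date, for example, C:/myDocuments/newBooks_20100927.txt. In order to build a field with that name, you did the following:
Got the system date by using a Get System Info step
Formatted this date as yyyyMMdd
Got the directory where the file was located from the Internal.Transformation.Filename.Directory variable
Concatenated the path and the formatted date by using a Java Expression, generating the final filename field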
These steps are among the most used for these situations. However, the steps and the way of building the field will depend on your particular case.
In the recipe, you used a Text File Input step, but the same applies for other Input steps: Excel Input, Property Input, and so on.
It may happen that you want to read a file with a CSV file input step, but notice that it doesn't have the option of accepting the name of the file from a previous step. Don't worry! If you create a hop from any step toward this step, the textbox named The filename field (data from previous steps) will magically show up, allowing the name to be provided dynamically.
This method for providing the name of the file also applies when you write a file by using a Text file output step.
What follows is a little background about the Get System Info step used in the recipe. After that, you will see how the Accept file name from field? feature can be used in the generation of files.
You can use the Get System Info step to retrieve information from the PDI environment. In the recipe, it was used to get the system date, but you can use it for bringing and adding to the dataset other environmental information, for example, the arguments from the command line, the transformation's name, and so on.
You can get further information about this step at the following URL:
Let's assume that you want to write files with book information, but a different file for each genre. For example, a file named fiction.txt with all the fiction books, another file named children.txt with the children books, and so on. To do this, you must create the name of the file dynamically as shown in the recipe. In this case, supposing that your dataset has a field with the genre of the book, you could create a Java Expression that concatenates the path, the field that has the genre, and the string .txt. Then, in the Text file output step, you should check the checkbox named Accept file name from field? and in the File name field listbox, select the field just created.
Running this transformation will generate different text files with book information, one file for each genre.
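The effect of writing one file per genre can be approximated in plain Java as shown here. This is a sketch under stated assumptions, not what PDI does internally: the class GenreFileWriter and the method writeByGenre are hypothetical, the genre is lower-cased so that the names match fiction.txt and children.txt, and fields are joined with commas without any quoting.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not PDI code: the target filename is derived
// from a field of each row, so rows end up in different files.
class GenreFileWriter {
    // Each row is {title, author, price, genre}. Returns the files written.
    static List<Path> writeByGenre(Path dir, List<String[]> rows) throws IOException {
        Map<String, List<String>> linesByFile = new LinkedHashMap<>();
        for (String[] row : rows) {
            // Assumption: lower-case the genre to get names such as fiction.txt.
            String filename = row[3].toLowerCase(Locale.ROOT) + ".txt";
            linesByFile.computeIfAbsent(filename, k -> new ArrayList<>())
                       .add(String.join(",", row)); // no quoting, for brevity
        }
        List<Path> written = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : linesByFile.entrySet()) {
            Path target = dir.resolve(entry.getKey());
            Files.write(target, entry.getValue());
            written.add(target);
        }
        return written;
    }
}
```

For example, calling writeByGenre with the sample books from this recipe would produce two files in the given directory, one for the fiction titles and one for the children titles.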