Working with structured data

In this column we look at writing data to files in a structured format and how to read it back in that format.

Writing To Files

To read files, we have focused on the fgets() function. This function relies on a file pointer (the handle returned by fopen()) to know which file to read from, and which line of the file to read next. Likewise, we will use the appropriately named fputs() function to write to a file. The following example shows how easy it is to use fputs().

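// Build the line to record.
$s = "The time is ".date('D M d H:i:s T Y')."\n";
// Open for appending; the file is created if it does not exist.
$fp = fopen("./","a");
// fputs() returns the number of bytes written.
$i = fputs($fp,"--\n".$s);
fputs($fp,"Wrote $i bytes to file\n");
fclose($fp);
echo "Wrote $i bytes to file\n";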

View this script from your PHP-enabled Web server (see the cover CD for information on setting up a PHP-enabled Web server if you do not have one). The script writes three lines of data to a file: a separator, the time, and the number of bytes written by the first fputs() call (collectively, an 'item'). It also outputs this final line to the Web user.

Notice that the script opens the file in mode 'a' - that is, for writing only, created if it does not exist, with the file pointer placed after the last line of the file. This means that when fputs.php is viewed a second time, the latest time will be appended to the end of the file. A useful exercise would be to change this script to place the most recent write at the beginning of the file.
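One possible approach to the exercise is sketched below. It is only an illustration under stated assumptions: prepend_item() is a hypothetical helper name, the file is assumed small enough to read into memory, and no locking is done, so two simultaneous requests could interfere with each other. A temporary file stands in for the real data file.

```php
<?php
// Hypothetical helper (not part of the column's scripts): write $item
// at the start of the file at $path, followed by the old contents.
function prepend_item(string $path, string $item): int
{
    // Read whatever is already there; a missing file counts as empty.
    $old = is_file($path) ? file_get_contents($path) : '';
    if ($old === false) {
        $old = '';
    }
    // Mode 'w' truncates the file, so the new item lands first.
    $fp = fopen($path, 'w');
    $n = fputs($fp, $item . $old);
    fclose($fp);
    return $n;
}

// Usage with a temporary file standing in for the data file.
$path = tempnam(sys_get_temp_dir(), 'items');
prepend_item($path, "--\nfirst\n");
prepend_item($path, "--\nsecond\n");
echo file_get_contents($path);
```

The second item now appears before the first. The cost is that the whole file is rewritten on every request, which mode 'a' avoids.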

You may also have noticed that the file pointer is closed with the fclose() function. It is important that this takes place when writing to a file opened with fopen(), since data that has been written may not reach the file until the file pointer is closed.
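The open, write, close sequence can be wrapped in a function that also checks the return values, as in the minimal sketch below. The name append_item() and the use of a temporary file are assumptions made for illustration; the item layout follows the script above.

```php
<?php
// Hypothetical helper: append one item to $path and return the byte
// count reported by the first fputs() call.
function append_item(string $path): int
{
    $fp = fopen($path, 'a');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path for appending");
    }
    $s = "The time is " . date('D M d H:i:s T Y') . "\n";
    $i = fputs($fp, "--\n" . $s);
    if ($i === false) {
        fclose($fp);
        throw new RuntimeException("Write to $path failed");
    }
    fputs($fp, "Wrote $i bytes to file\n");
    // Close the file pointer once the item is complete.
    fclose($fp);
    return $i;
}

// Usage: write two items to a temporary file and show the result.
$path = tempnam(sys_get_temp_dir(), 'item');
$written = append_item($path);
append_item($path);
echo file_get_contents($path);
```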

Reading Structured Data

In order to read the data back one item at a time, we will need to approach the reading of the file differently than last month. This requires the construction of an algorithm that locates the first line of an item by the "--\n" separator and assumes that the next two lines are the time and bytes written strings (more sophisticated algorithms in later months will not make such assumptions). This kind of algorithm is called a parser (pronounced 'par-zer').

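$fp = fopen("./","r");
// Read one line at a time until a read fails or the end of file is reached.
while(($s = fgets($fp,1024)) && !feof($fp)) {
    // An item starts with the "--" separator line.
    if($s[0] == '-' && $s[1] == '-') {
        // The next two lines are the time and the bytes written.
        if(($time = fgets($fp,1024)) && ($bytes = fgets($fp,1024))) {
            echo "<TR>\n<TD>\n$time</TD>\n\n";
            echo "<TD>\n$bytes</TD>\n</TR>\n";
        } else {
            echo "<TR><TD>Data file corrupted!\n</TD></TR>";
        }
    } else {
        echo "<TR><TD>Data file corrupted!\n</TD></TR>";
    }
}
fclose($fp);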

Save this as parser1.php in the same directory as the file and request it from your Web server. All the items you have created with fputs.php are returned in an HTML table, which reflects the structure of our data.

parser1.php introduces a few new features of PHP as well as the concept of parsing. To look at this step by step: first, the script opens the file for reading and places the file pointer at the beginning of the file. It then enters a loop, reading one line at a time. The conditions for the loop continuing are a) that the call to fgets() returns a true result (that is, it reads a line from the file) and b) that the current file pointer is not at the end of the file.

These two conditions are joined by a 'logical AND' (&&). That is, the loop body is executed only when both conditions are true (a logical OR, denoted by ||, would execute it when either condition is true).
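The difference between the two operators, and the fact that PHP stops evaluating as soon as the result is known, can be seen in the small sketch below. The check() function and the $calls array are hypothetical and exist only to record which operands are actually evaluated.

```php
<?php
// Records the name of every operand that gets evaluated.
$calls = [];

// Hypothetical helper: note that it was called, then return $result.
function check(string $name, bool $result): bool
{
    global $calls;
    $calls[] = $name;
    return $result;
}

// AND: the first operand is false, so 'b' is never evaluated.
$and = check('a', false) && check('b', true);

// OR: the first operand is true, so 'd' is never evaluated.
$or = check('c', true) || check('d', false);

var_dump($and);                   // bool(false)
var_dump($or);                    // bool(true)
echo implode(',', $calls), "\n";  // a,c
```

In the parser's loop this means feof() is not called at all once fgets() has failed to return a line.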

Next, the script checks whether the first and second characters returned by fgets() are dashes ("-"), thereby matching the item separator. It does this by comparing the first byte of the string $s ($s[0]) and the second byte ($s[1]) to a dash. If both match, the script continues to parse the item; otherwise, it assumes the data is corrupted and tells the Web reader so.
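The same test can also be written with strncmp(), which compares only the first few bytes of two strings and copes with lines shorter than two characters. This is an alternative sketch rather than the method parser1.php uses, and is_separator() is a hypothetical name.

```php
<?php
// Hypothetical helper: true if $line begins with the "--" separator.
function is_separator(string $line): bool
{
    // strncmp() returns 0 when the first 2 bytes of both strings match.
    return strncmp($line, "--", 2) === 0;
}

var_dump(is_separator("--\n"));           // bool(true)
var_dump(is_separator("-"));              // bool(false)
var_dump(is_separator(""));               // bool(false)
var_dump(is_separator("The time is\n"));  // bool(false)
```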

If the script has found an item, it reads the time line and the bytes written line. As before, it only continues if both reads succeed. The script then outputs the time and bytes written lines intermixed with HTML tags.
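The parsing steps can also be separated from the HTML output by collecting the items into an array first. The sketch below is one way to do that under stated assumptions: parse_items() is a hypothetical function, a null entry marks a line that could not be parsed, and a temporary file with made-up contents stands in for the data file.

```php
<?php
// Hypothetical helper: return one array per item found in $path,
// or null for each place where the expected structure is missing.
function parse_items(string $path): array
{
    $items = [];
    $fp = fopen($path, 'r');
    while (($s = fgets($fp, 1024)) !== false) {
        // An item is a "--" line followed by a time and a bytes line.
        if (strncmp($s, '--', 2) === 0
            && ($time = fgets($fp, 1024)) !== false
            && ($bytes = fgets($fp, 1024)) !== false) {
            $items[] = ['time' => rtrim($time), 'bytes' => rtrim($bytes)];
        } else {
            $items[] = null; // corrupted data
        }
    }
    fclose($fp);
    return $items;
}

// Usage with sample data in a temporary file.
$path = tempnam(sys_get_temp_dir(), 'parse');
file_put_contents(
    $path,
    "--\nThe time is Mon\nWrote 19 bytes to file\n" .
    "--\nThe time is Tue\nWrote 19 bytes to file\n"
);
$items = parse_items($path);
print_r($items);
```

A script could then loop over the returned array to build the table rows, keeping the file handling and the HTML apart.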

Notice how simple it is to embed HTML into your scripts in this way. If you cannot work out how parser1.php is constructing the HTML table, have a look at the HTML source in your Web browser - you will notice that the loop constructs all the table cells.

The final point of interest in parser1.php is the number of bytes it reads from each line. Why 1024? The simple answer is that we do not want to assume that all lines will be short. fgets() reads at most one byte fewer than the length it is given, so a small limit would split a long line across several reads. This would cause chaotic results if the file became corrupted, effectively breaking the data corruption detection.
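The effect of a limit that is too small can be seen in the sketch below, which reads a temporary file with a deliberately tiny length of 5. The file contents are made up for illustration.

```php
<?php
// Write two lines to a temporary file: one long, one short.
$path = tempnam(sys_get_temp_dir(), 'len');
file_put_contents($path, "abcdefghij\nxyz\n");

$fp = fopen($path, 'r');
$chunks = [];
// With a length of 5, fgets() returns at most 4 bytes per call and
// stops early only when it reaches a newline.
while (($s = fgets($fp, 5)) !== false) {
    $chunks[] = $s;
}
fclose($fp);

// The 10-character line arrives in three pieces: "abcd", "efgh", "ij\n".
var_dump($chunks);
```

A parser that expects one line per read would treat each of those pieces as a separate line, which is why a generous limit such as 1024 is used.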

Gavin Sherry

PC World