Working with structured data

In this column we look at writing data to files in a structured format and how to read it back in that format.


Writing To Files

So far, we have read files with the fgets() function. This function relies on a file pointer to know which file to read from, and which line of that file to read. Likewise, we will use the appropriately named fputs() function to write to a file. The following example shows how easy it is to use fputs().

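// Build the line holding the current time.
$s = "The time is ".date('D M d H:i:s T Y')."\n";
// Mode 'a' opens for appending, creating the file if needed.
$fp = fopen("./","a");
// fputs() returns the number of bytes written.
$i = fputs($fp,"--\n".$s);
fputs($fp,"Wrote $i bytes to file\n");
fclose($fp);
echo "Wrote $i bytes to file\n";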

View this script from your PHP-enabled Web server (see the cover CD for information on setting up a PHP-enabled Web server if you do not have one). This script writes three lines of data to a file: a separator, the time, and the number of bytes taken up by the first two lines (collectively, an 'item'). It also outputs this final line to the Web user.

Notice that the script opens the file in mode 'a': that is, opened for writing only, created if it does not exist, with the file pointer placed at the end of the file. This means that when fputs.php is viewed a second time, the latest time will be appended to the end of the file. A useful exercise would be to change this script to place the most recent write at the beginning of the file.
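One possible approach to that exercise is sketched below. It is only an illustration: the helper name prepend_item() and the temporary file standing in for the data file are our own assumptions, and it reads the whole file into memory, which suits small files only.

```php
<?php
// Hypothetical helper: put $item before whatever the file already holds.
function prepend_item($path, $item)
{
    // Read the existing contents; a missing file counts as empty.
    $old = file_exists($path) ? file_get_contents($path) : "";
    if ($old === false) {
        return false;
    }

    // Mode 'w' truncates the file, so the new item lands at the top.
    $fp = fopen($path, "w");
    if ($fp === false) {
        return false;
    }
    $written = fputs($fp, $item . $old);
    fclose($fp);
    return $written;
}

// Usage, with a temporary file standing in for the data file.
$path = tempnam(sys_get_temp_dir(), "items");
prepend_item($path, "--\nfirst\n");
prepend_item($path, "--\nsecond\n");
echo file_get_contents($path);   // the "second" item now comes first
```

Rewriting the whole file on every request is the price of prepending; appending with mode 'a' avoids it, which is one reason the original script appends.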

You may also have noticed that the file pointer is closed with the fclose() function. It is important that this takes place when writing to a file opened with fopen(): closing the file pointer flushes any buffered data to the file, and data that is never flushed can be lost.
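As a rough sketch of the same write with its checks spelled out, the following wraps the steps in a function. The name append_item() and the temporary file are assumptions made for illustration; the point is that fopen() and fputs() both return false on failure, and that fclose() runs on every path that opened the file.

```php
<?php
// Hypothetical helper: append one item and return the bytes written,
// or false if the file could not be opened or written.
function append_item($path, $time_line)
{
    $fp = fopen($path, "a");
    if ($fp === false) {
        return false;
    }
    $i = fputs($fp, "--\n" . $time_line);
    if ($i !== false) {
        fputs($fp, "Wrote $i bytes to file\n");
    }
    fclose($fp);   // flushes buffered data and releases the handle
    return $i;
}

// Usage, with a temporary file standing in for the data file.
$path = tempnam(sys_get_temp_dir(), "items");
$i = append_item($path, "The time is " . date('D M d H:i:s T Y') . "\n");
echo "Wrote $i bytes to file\n";
```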

Reading Structured Data

In order to read back the data one item at a time, we need to approach the reading of the file differently from last month. This requires the construction of an algorithm that locates the first line of an item by the "--\n" separator and assumes that the next two lines are the time and bytes written strings (more sophisticated algorithms in later months will not make such assumptions). This kind of algorithm is called a parser (pronounced 'par-zer').

$fp = fopen("./","r");
echo "<TABLE>\n";
while(($s = fgets($fp,1024)) && !feof($fp)) {
    // An item starts with the "--" separator line.
    if($s[0] == '-' && $s[1] == '-') {
        // The next two lines are the time and the bytes written.
        if(($time = fgets($fp,1024)) && ($bytes = fgets($fp,1024))) {
            echo "<TR>\n<TD>\n$time</TD>\n\n";
            echo "<TD>\n$bytes</TD>\n</TR>\n";
        } else {
            echo "<TR><TD>Data file corrupted!\n</TD></TR>";
        }
    } else {
        echo "<TR><TD>Data file corrupted!\n</TD></TR>";
    }
}
echo "</TABLE>\n";
fclose($fp);

Save this as parser1.php in the same directory as the file and request it from your Web server. All the items you have created with fputs.php are returned in an HTML table, which reflects the structure of our data.

parser1.php introduces a few new features of PHP as well as the concept of parsing. To look at this step by step: first, the script opens the file for reading and places the file pointer at the beginning of the file. It then enters a loop, reading one line at a time. The conditions for the loop continuing are a) that the call to fgets() returns a true result (that is, it reads a line from the file) and b) that the current file pointer is not at the end of the file.

These two conditions are conjoined by a 'logical AND' - &&. That is, the loop is executed only when both conditions are true (a logical OR, denoted by ||, would have executed the loop when either condition was true).
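The difference between the two operators can be seen in a short sketch. The variables $got_line and $not_at_end are hypothetical stand-ins for the two loop conditions, not names from parser1.php. The last lines also show that && stops evaluating as soon as the left side is false, so in the loop above feof() is not consulted once fgets() has failed.

```php
<?php
// Hypothetical stand-ins for the two loop conditions.
$got_line   = true;    // fgets() returned a line
$not_at_end = false;   // the file pointer has reached the end of the file

var_dump($got_line && $not_at_end);   // bool(false): both must be true
var_dump($got_line || $not_at_end);   // bool(true): one true value is enough

// && short-circuits: the right side is skipped when the left side is false.
$calls = 0;
$right_side = function () use (&$calls) {
    $calls++;
    return true;
};
$result = false && $right_side();     // $right_side() is never called
```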

Next the script checks whether the first and second characters returned by fgets() are dashes ("-"), thereby matching the item separator. The script achieves this by comparing the first byte of the string $s ($s[0]) and the second byte ($s[1]) to a dash. If both match, the script continues to parse the item; otherwise, it assumes the data is corrupted and tells the Web reader this.

If the script has found an item, it reads the time line and the bytes written line. As before, it only continues if both results are true. The script then outputs the time and bytes written lines intermixed with HTML tags.
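The same parsing steps can also be sketched as a function that collects items into an array rather than echoing HTML, which makes the logic easier to test. This is an illustration under our own assumptions, not the column's script: the name read_items(), the array keys and the in-memory stream are hypothetical, strncmp() is swapped in for the byte-by-byte comparison, and the loop tests fgets() against false instead of combining it with feof().

```php
<?php
// Hypothetical helper: read items from an open file pointer into an array.
// Returns false as soon as a malformed item is found.
function read_items($fp)
{
    $items = array();
    while (($s = fgets($fp, 1024)) !== false) {
        // Every item must begin with the "--" separator line.
        if (strncmp($s, "--", 2) !== 0) {
            return false;
        }
        // The next two lines are the time and the bytes written.
        $time  = fgets($fp, 1024);
        $bytes = fgets($fp, 1024);
        if ($time === false || $bytes === false) {
            return false;   // the item was cut short
        }
        $items[] = array("time" => rtrim($time), "bytes" => rtrim($bytes));
    }
    return $items;
}

// Usage: an in-memory stream stands in for the data file.
$fp = fopen("php://memory", "w+");
fputs($fp, "--\nThe time is noon\nWrote 20 bytes to file\n");
fputs($fp, "--\nThe time is one\nWrote 19 bytes to file\n");
rewind($fp);
$items = read_items($fp);
fclose($fp);
print_r($items);
```

Keeping the parsing separate from the output means the table markup could later change without touching the code that decides what counts as a valid item.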

Notice how simple it is to embed HTML into your scripts in this way. If you cannot work out how parser1.php is constructing the HTML table, have a look at the HTML source in your Web browser - you will notice that the loop constructs all the table cells.

The final point of interest in parser1.php is the number of bytes it reads from each line. Why 1024? The simple answer is that we do not want to assume that all lines will be short. fgets() reads at most one byte fewer than the length it is given, so a small limit would split a long line across several calls. If the file became corrupted, this would cause chaotic results, effectively breaking the data corruption detection.
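To see the effect of a limit that is too small, the sketch below reads an 11-byte line with a length of 5. The in-memory stream and the sample data are assumptions made for illustration. Each call returns at most four bytes, so a parser expecting one line per call would take three calls to get through what is really a single line.

```php
<?php
// An in-memory stream stands in for the data file.
$fp = fopen("php://memory", "w+");
fputs($fp, "--\nabcdefghij\n");
rewind($fp);

$first = fgets($fp, 1024);   // "--\n": the whole separator line fits
$part1 = fgets($fp, 5);      // "abcd": stops after length - 1 bytes
$part2 = fgets($fp, 5);      // "efgh"
$part3 = fgets($fp, 5);      // "ij\n": stops at the newline
fclose($fp);
```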


Gavin Sherry

PC World