- — 18 October, 2002 11:48
Many people are first exposed to databases by playing with Access, which comes included in many versions of Microsoft Office. Access is a powerful product aimed primarily at developers, and many novices come away from dabbling with it determined to stick to more user-friendly packages such as Excel.
That attitude might be OK for individual users, but most small businesses will eventually find themselves contemplating database software as their needs grow and their companies expand. But before you can even go about installing a database, you need to grapple with such confusing concepts as database servers, relational and object databases, and the thorny question of why apparently similar products range in price from free to hundreds of thousands of dollars.
This Buying Guide will help you through the database minefield. While picking a database server will never be as quick a decision as, say, buying a printer, it can be a highly valuable purchase if you go about it the right way.
What is a database? (Back to contents)
A database is simply an organised set of information. Strictly speaking, there's no need for it to even be in computerised form - address books are often used as a simple example of a database, since they contain large volumes of information organised into categories (name, address, and phone number). Despite this, most people now take the term 'database' to refer to information stored electronically.
Note also that the general definition of a database given here could encompass many common PC applications, such as electronic mail (which can be organised by recipient or sender) or information in a spreadsheet (which tends to be structured mathematically). In practice, the term 'database' is most commonly used to refer to highly structured information (examples might include order forms or medical information). Most business databases will contain a carefully planned set of information that can be analysed to indicate overall trends, as well as providing a historical record of past transactions and activities.
A basic database system simply allows you to enter and search for information (a process often known as querying). Most modern databases also support the development of specific applications that run on top of the database, which enable you to access the features you need without having to worry about all the complexities of the system. For instance, within one company general staff might have access to an order entry system, while financial staff have access to automatic reporting systems that provide sales summaries and other information. Both use a database server to store information, but the means of entering this, and the ability to change it after entry, will vary between the different applications.
What is a database server? (Back to contents)A single-person business could run database software on just one PC quite easily. However, in most businesses data will need to be accessed by multiple staff across a network, often simultaneously. Database server software handles this task, making databases available to all relevant employees. This is not simply a matter of creating network connections. Database server software must ensure that information isn't modified by multiple staff members at once, index and sort the information that is entered so it can be easily searched, and allow analysis of the information stored in the database to produce reports. It also allows different applications to use the same centralised storage mechanism. With appropriate hardware, a database server can handle as many of these options as are needed. Staff may use different applications to access this information, but the server will handle the core data storage. Since databases were one of the earliest computer applications to be developed, they have become highly sophisticated over the years. Modern database servers can handle huge volumes of information and present it in a variety of formats, including automatically outputting Web pages from stored data and allowing the placing of orders and other electronic commerce tasks. For many businesses, a database server in some form will lie at the heart of most daily activities. A note about the word server is in order here. Database server software is distinct both from server hardware (which is the physical machine or machines used to run a network) and the server operating system (which runs on the server hardware and provides an interface between it and specific applications, of which a database server is one example). Despite this distinction, in practice all three will work together closely, and your choice of database server will influence your server OS and hardware choices.
The types of databases available (Back to contents)As database software has evolved, a number of different approaches to storing and linking information have emerged. Some of the more common include: Relational databases. A relational database creates formal definitions of all the included items in a database, setting them out in tables, and defines the relationship between them. For instance, a typical business database would include tables for defining both customers and orders. Using ids or keys, the two tables can be related together. Such databases are called 'relational' because they explicitly define these connections (an order form can look up customer details from the customer table rather than having to store the information twice). Most relational databases now make use of SQL to handle queries (discussed in more detail below). Currently these are the most common form of database. Object databases. Object databases store data in discreet, self-contained units - objects. These objects have specific data, attributes and behaviours associated with them. An extremely simple example might be a product database with a shirt object, which has attributes such as size, colour, and price. In practical use, the main difference between object and relational databases is the way in which data is accessed. Programmers use object-oriented programming languages to access the data objects from the object database by calling methods in their code. This takes much of the information that would have resided in the application code and transfers that information to the object database. Thus the application code is simplified. However, at the same time the fact that the database and application are tightly entwined can make accessing the data outside of the application more complex. Object-relational databases. Object-relational databases attempt to combine object and relational approaches. This allows the benefits of using objects where necessary to be tied to the strengths of relational databases. Hierarchical databases. While relational databases arrange data in tabular format, hierarchical databases arrange them in a tree format, with a parent node leading to further child nodes (which in turn may have further nodes of their own). The model is very similar to the way in which a program such as Windows Explorer displays the contents of a hard drive (double-clicking on a parent directory leads down the tree to further information, and so on down the directory tree). This allows for multiple types of subsidiary data, but makes it difficult to identify complex multiple relationships between individual data items (just as there is no obvious link between two subdirectories on a hard drive). Until recently, hierarchical databases have been more common in computer science fields than in real-world applications. However, hierarchical methods have become more popular with the emergence of XML (Extensible Markup Language), which uses a hierarchical structure, as a common data exchange format.
What do I need to know about SQL? (Back to contents)SQL (Structured Query Language) is used by relational databases to define queries and help generate reports. First developed in 1976, it provides a standardised means of sending queries to relational databases. SQL also defines more fundamental elements of databases, such as data types. SQL has become a dominant standard in the world of database development, since it allows developers to use the same basic constructions to query data from a wide variety of systems. The central functions of SQL have been defined by international standards organisations: originally the American National Standards Institute (ANSI), and subsequently the International Standards Organisation (ISO). Like most computer-based standards, SQL also comes in a number of flavours. The two most recent iterations are SQL-99 (also sometimes known as SQL-3) and SQL-92, both named for the years in which they were first released. (SQL-99 had been originally planned for release in 1996). While SQL-92 defined three individual levels of compliance (basic, intermediate and advanced), SQL-99 reverted to the model used in previous versions of providing just one large feature list. More importantly, SQL-99 also added some basic support for object features, extending SQL's usefulness and blurring the distinction between relational and object approaches to databases. It is comparatively rare for vendors to implement the precise standards laid down for SQL, which is a complex standard running into thousands of pages. Several companies choose not to implement every aspect of the existing standards, arguing that the functions in question are rarely if ever needed by developers or users. Simultaneously, many provide additional, functions (known as extensions) to make particular tasks easier. Despite what you might suspect, many companies will simultaneously add their own extensions while ignoring some aspects of the basic standard. Whether these additions and exclusions are important to your business will depend on the exact mix of applications you wish to run or develop, and what existing applications you already have in place.
What other features should I look for in a database? (Back to contents)When choosing a database server, the most important consideration will be whether it can deliver the specific application functions you require. These may be supplied in the form of pre-packaged software, or you may choose to develop your own (or more likely hire a consultant to do so). This may involve considerable expenditure, but this needs to be balanced against improved staff productivity and the ability to more accurately analyse your business. Standards compliance. As discussed above, SQL support varies widely between database server suppliers. Databases which comply with SQL should allow relatively straightforward data exchange, so SQL compliance is important, especially in environments running more than one operating system. Security systems. Databases often store highly valuable and sensitive commercial information, so it's important that there is some security system in place, even if this is only a basic username/password system. Most database servers will provide audit trails, allowing you to see who has entered, accessed or modified information. If your database server is going to be exposed to the Internet, then security mechanisms will need to be more robust and you will need to consider whether other mechanisms (such as encryption) are necessary as well. Performance features. Databases are generally critical applications, and even a brief outage can be harmful to your business. To help prevent this, modern database servers have borrowed many features from the world of general network operating systems, including fault tolerance (systems to keep the server running in the event of unexpected errors) and load balancing (which allows database queries on high-volume systems to be handled by multiple servers, improving performance and response times). These are unlikely to be needed if you're just running a single database server, but as your needs expand they are likely to become more crucial considerations. If your database server is used for e-commerce applications, these availability issues will take on a heightened degree of importance. While you may not require all these features immediately, you should consider future needs as well as your current plans. Retrofitting these features to your system is likely to be more difficult than installing a database server that supports them from the beginning, even if it takes time for you to actively deploy them. XML: If you will be doing data interchange now or in the future you may want to look for a database that allows extraction of data as XML. This may make using XML as a data interchange format simpler.
The database players (Back to contents)
Reflecting its long heritage, literally dozens of companies offer database server solutions. Some are tailored to individual markets; some are designed to work closely with existing application and OS software; some are aimed at people building their own specific applications, while others have many commercial add-on applications available. A partial, alphabetised list of major companies in the space would include Borland, IBM, Informix, Microsoft, MySQL, Oracle, Red Hat, SAP and Sybase. Many other companies supply applications that will work with database servers from these providers.
What hardware will I need? (Back to contents)There's no single answer to this question. You will need to work out how many staff members are likely to connect to the database, and how many transactions (either looking up existing data, modifying it or entering new information) will take place. These will provide you with broad parameters for selecting an appropriate database server, and in turn give you a good indication of what your server requirements will be. As with many computing tasks, the more memory, processing power and disk storage you can provide on the server, the better. Don't neglect network connection speeds, either; if you have a high-powered server but a poorly configured network, you'll lose most of the advantage waiting for data to be sent back to you.
How much should I expect to pay? (Back to contents)This is also a difficult question. Database server prices vary hugely. At one end of the spectrum, open source database solutions are available for no up-front cost. At the other extreme, specialised database solutions can cost hundreds of thousands of dollars. Two broad parameters will guide cost: the number of supported features, and the number of connected users. A system that is designed to serve just a few dozen staff and with minimal security features should be cheaper than one that handles thousands of simultaneous transactions and also powers a number of Web sites. The other important factor to consider is the degree to which your database accessing application will be customised. If you can make use of an off-the-shelf application, this will generally be cheaper than having a developer build a custom system for your business. Even a 'free' database system using readily available software will still require an investment in training for your staff, however.