The concept of a prepared statement is that it a SQL statement that is lacking any parameters or literals and with these being replaced by placeholders. Then the statement is parsed and optimized without those values, and the placeholders are referenced to program variables. Once this is done, to execute the prepared statement, you set the program variables to the appropriate values and then execute the statement. You may change the program variables referencing the placeholder values and re-execute the SQL statement as many times as you want, without having to re-parse or optimize the statement.
The different steps then are:
- Prepare the statement - You prepare the statement with the appropriate placeholders (which are question marks), like: SELECT foo FROM bar WHERE foobar = ?
- Bind the placeholder values to program variables.
- Bind the result columns to program variables (if there are any columns returned that is).
- Set the program variables that are bound to the statement placeholder literals.
- Execute the prepared statement.
- If there are columns returned, i.e. this is a SELECT or SHOW statement for example, store the result, fetch the rows and the free the result.
- Repeat from step 4 until done.
- Free the prepared statement.
Working with prepared statement in MySQL, as in most other database systems, require a bit more work than playing with simple non-prepared statements. The prepare phase is done with the mysql_stmt_prepare() call, which use a MYSQL_STMT to keep track of the statement, and which has been initialized with a call to mysql_stmt_init().
The next step is to bind the placeholders, and this is done with a call to mysql_stmt_bind_param(), which takes the MYSQL_STMT struct as above, as well as an array of MYSQL_BIND structs, each of them referencing one bin parameter in the statement, and each to be filled in with information to the MySQL server about where the program variable is, what type it has and a pointer to an indicator of the NULL / NOT NULL status if this variable. Note that to pass a NULL, you do not pass the string NULL, which will do just that: Use the string "NULL", instead you MUST set the is_null member indicator of the MYSQL_BIND struct.
After binding parameters, you have to bind result columns, if there are any columns returned. The way this is done with the MySQL C API looks a bit awkward, but not really complicated. In essence, you call mysql_stmt_result_metadata() to get a MYSQL_RES, which is a familiar struct for any user of the classic MySQL C API, and then you use that to get any information you want on the result data from the statement.
You can nod bind the result columns, and again you use an array of MYSQL_BIND structs for this. Note that you have to pass appropriate data, like space for the returned column data and for a NULL indicator. Failure to do so will cause a segmentation fault and may in extreme cases cause really bad karma.
This was the complex part, now all you have to do is set the program variables referenced by the MYSQL_BIND array for the parameters, call mysql_stmt_execute() to execute the statement and then, if there are columns returned, call mysql_stmt_store_result() to get the result, then call mysql_stmt_fetch() to get each row, and for every row, you can look at the data referenced by the MYSQL_BIND struct array you passed for the column data. When you are done with the result, call mysql_stmt_free_result() and that's about it.
Well, to be honest, there is a bit more to it, but this is the basics. So what are the advantages? It really should be faster, as there is a lot less parsing and optimization going on, and in many database systems, using prepared statements increases performance A LOT. Not so with MySQL, probably due to the limited amount of optimization we do, and the rather basic SQL syntax we support. But there are cases where there are distinct advantages, in particular SELECTs. One reason for this is that a SELECT is usually a more complex SQL than, say, an INSERT or a SELECT (and mostly also an UPDATE), whcih means there is more to gain from limited time in the parser and optimizer. Also, a SELECT is in the same way also more complicated to optimize. I have done some simple tests, and has seen some 50% performance gain when using prepared statements with a still simple SELECT with one join. On the other hand, I have yet to see any advantages for INSERTs, but also no performance loss either, instead, I could see just about the same performance.
And are there disadvantages? Yes, I'm afraid so. For CALL statements, in/out parameters are not supported, nor is handling result sets from prepared statements. In addition, multiple statement results are not supported at all when using the prepared statement C API. Read more on the limitations in the MySQL manual.
These limitations are the worst aspects of using prepared statements, and I really hope we will be able to fix them soon.
/Karlsson