[ Index ] |
PHP Cross Reference of Phabricator |
[Summary view] [Print] [Text view]
1 @title Database Schema 2 @group developer 3 4 This document describes key components of the database schema and should answer 5 questions like how to store new types of data. 6 7 Database System 8 =============== 9 10 Phabricator uses MySQL or another MySQL-compatible database (like MariaDB 11 or Amazon RDS). 12 13 Phabricator the InnoDB table engine. The only exception is the 14 `search_documentfield` table which uses MyISAM because MySQL doesn't support 15 fulltext search in InnoDB (recent versions do, but we haven't added support 16 yet). 17 18 We are unlikely to ever support other incompatible databases like PostgreSQL or 19 SQLite. 20 21 PHP Drivers 22 =========== 23 24 Phabricator supports [[ http://www.php.net/book.mysql | MySQL ]] and 25 [[ http://www.php.net/book.mysqli | MySQLi ]] PHP extensions. 26 27 Databases 28 ========= 29 30 Each Phabricator application has its own database. The names are prefixed by 31 `phabricator_` (this is configurable). This design has two advantages: 32 33 - Each database is easier to comprehend and to maintain. 34 - We don't do cross-database joins so each database can live on its own 35 machine. This gives us flexibility in sharding data later. 36 37 Connections 38 =========== 39 40 Phabricator specifies if it will use any opened connection just for reading or 41 also for writing. This allows opening write connections to a primary and read 42 connections to a replica in primary/replica setups (which are not actually 43 supported yet). 44 45 Tables 46 ====== 47 48 Most table names are prefixed by their application names. For example, 49 Differential revisions are stored in database `phabricator_differential` and 50 table `differential_revision`. This generally makes queries easier to recognize 51 and understand. 52 53 The exception is a few tables which share the same schema over different 54 databases such as `edge`. 55 56 We use lower-case table names with words separated by underscores. 57 58 Column Names 59 ============ 60 61 Phabricator uses `camelCase` names for columns. The main advantage is that they 62 directly map to properties in PHP classes. 63 64 Don't use MySQL reserved words (such as `order`) for column names. 65 66 Data Types 67 ========== 68 69 Phabricator defines a set of abstract data types (like `uint32`, `epoch`, and 70 `phid`) which map to MySQL column types. The mapping depends on the MySQL 71 version. 72 73 Phabricator uses `utf8mb4` character sets where available (MySQL 5.5 or newer), 74 and `binary` character sets in most other cases. The primary motivation is to 75 allow 4-byte unicode characters to be stored (the `utf8` character set, which 76 is more widely available, does not support them). On newer MySQL, we use 77 `utf8mb4` to take advantage of improved collation rules. 78 79 Phabricator stores dates with an `epoch` abstract data type, which maps to 80 `int unsigned`. Although this makes dates less readable when browsing the 81 database, it makes date and time manipulation more consistent and 82 straightforward in the application. 83 84 We don't use the `enum` data type because each change to the list of possible 85 values requires altering the table (which is slow with big tables). We use 86 numbers (or short strings in some cases) mapped to PHP constants instead. 87 88 JSON and Other Serialized Data 89 ============================== 90 91 Some data don't require structured access -- we don't need to filter or order by 92 them. We store these data as text fields in JSON format. This approach has 93 several advantages: 94 95 - If we decide to add another unstructured field then we don't need to alter 96 the table (which is slow for big tables in MySQL). 97 - Table structure is not cluttered by fields which could be unused most of the 98 time. 99 100 An example of such usage can be found in column 101 `differential_diffproperty.data`. 102 103 Primary Keys 104 ============ 105 106 Most tables have auto-increment column named `id`. Adding an ID column is 107 appropriate for most tables (even tables that have another natural unique key), 108 as it improves consistency and makes it easier to perform generic operations 109 on objects. 110 111 For example, @{class:LiskMigrationIterator} allows you to very easily apply a 112 migration to a table using a constant amount of memory provided the table has 113 an `id` column. 114 115 Indexes 116 ====== 117 118 Create all indexes necessary for fast query execution in most cases. Don't 119 create indexes which are not used. You can analyze queries @{article:Using 120 DarkConsole}. 121 122 Older MySQL versions are not able to use indexes for tuple search: 123 `(a, b) IN ((%s, %d), (%s, %d))`. Use `AND` and `OR` instead: 124 `((a = %s AND b = %d) OR (a = %s AND b = %d))`. 125 126 Foreign Keys 127 ============ 128 129 We don't use foreign keys because they're complicated and we haven't experienced 130 significant issues with data inconsistency that foreign keys could help prevent. 131 Empirically, we have witnessed first hand as `ON DELETE CASCADE` relationships 132 accidentally destroy huge amounts of data. We may pursue foreign keys 133 eventually, but there isn't a strong case for them at the present time. 134 135 PHIDs 136 ===== 137 138 Each globally referencable object in Phabricator has its associated PHID 139 ("Phabricator ID") which serves as a global identifier, similar to a GUID. 140 We use PHIDs for referencing data in different databases. 141 142 We use both autoincrementing IDs and global PHIDs because each is useful in 143 different contexts. Autoincrementing IDs are meaningfully ordered and allow 144 us to construct short, human-readable object names (like `D2258`) and URIs. 145 Global PHIDs allow us to represent relationships between different types of 146 objects in a homogeneous way. 147 148 For example, infrastructure like "subscribers" can be implemented easily with 149 PHID relationships: different types of objects (users, projects, mailing lists) 150 are permitted to subscribe to different types of objects (revisions, tasks, 151 etc). Without PHIDs, we would need to add a "type" column to avoid ID collision; 152 using PHIDs makes implementing features like this simpler. 153 154 Transactions 155 ============ 156 157 Transactional code should be written using transactions. Example of such code is 158 inserting multiple records where one doesn't make sense without the other or 159 selecting data later used for update. See chapter in @{class:LiskDAO}. 160 161 Advanced Features 162 ================= 163 164 We don't use MySQL advanced features such as triggers, stored procedures or 165 events because we like expressing the application logic in PHP more than in SQL. 166 Some of these features (especially triggers) can also cause a great deal of 167 confusion, and are generally more difficult to debug, profile, version control, 168 update, and understand than application code. 169 170 Schema Denormalization 171 ====================== 172 173 Phabricator uses schema denormalization sparingly. Avoid denormalization unless 174 there is a compelling reason (usually, performance) to denormalize. 175 176 Schema Changes and Migrations 177 ============================= 178 179 To create a new schema change or migration: 180 181 **Create a database patch**. Database patches go in 182 `resources/sql/autopatches/`. To change a schema, use a `.sql` file and write 183 in SQL. To perform a migration, use a `.php` file and write in PHP. Name your 184 file `YYYYMMDD.patchname.ext`. For example, `20141225.christmas.sql`. 185 186 **Keep patches small**. Most schema change statements are not transactional. If 187 a patch contains several SQL statements and fails partway through, it normally 188 can not be rolled back. When a user tries to apply the patch again later, the 189 first statement (which, for example, adds a column) may fail (because the column 190 already exists). This can be avoided by keeping patches small (generally, one 191 statement per patch). 192 193 **Use namespace and character set variables**. When defining a `.sql` patch, 194 you should use these variables instead of hard-coding namespaces or character 195 set names: 196 197 | Variable | Meaning | Notes | 198 |---|---|---| 199 | {$NAMESPACE} | Storage Namespace | Defaults to `phabricator` | 200 | {$CHARSET} | Default Charset | Mostly used to specify table charset | 201 | {$COLLATE_TEXT} | Text Collation | For most text (case-sensitive) | 202 | {$COLLATE_SORT} | Sort Collation | For sortable text (case-insensitive) | 203 | {$CHARSET_FULLTEXT} | Fulltext Charset | Specify explicitly for fulltext | 204 | {$COLLATE_FULLTEXT} | Fulltext Collate | Specify explicitly for fulltext | 205 206 207 **Test your patch**. Run `bin/storage upgrade` to test your patch. 208 209 See Also 210 ======== 211 212 - @{class:LiskDAO}
title
Description
Body
title
Description
Body
title
Description
Body
title
Body
Generated: Sun Nov 30 09:20:46 2014 | Cross-referenced by PHPXref 0.7.1 |