We built a news site. Every day we import tens of thousands of records from a web API.
To provide a precise search service, our table uses MyISAM with a FULLTEXT index on (title, content, date). The site is currently being tested on a GoDaddy VDS with 2 GB RAM and 30 GB of disk space (no swap, because the VDS does not allow creating swap).
Running grep "model name" /proc/cpuinfo shows that GoDaddy uses an Intel(R) Xeon(R) CPU L5609 @ 1.87GHz.
Here is our MySQL insert statement. It uses FROM dual with NOT EXISTS to avoid inserting duplicate records, and the table's FULLTEXT index is always on:
INSERT INTO newstable
(title,link,content,date,source,image,imagesource)
SELECT '".$title."','".$link."','','".$content."','".$date."','".$source."','".$image."','".$imagesource."'
FROM dual WHERE not exists
(SELECT content FROM newstable WHERE newstable.content = '".$content."')
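The insert-if-absent pattern above can also be written with bound parameters instead of string concatenation. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own, and the schema, the function name insert_if_new, and the sample values are assumptions for illustration, not the site's real code. With a MySQL driver the idea is the same, though the placeholder syntax may differ and the FROM dual clause would stay.

```python
import sqlite3

# Hypothetical schema mirroring the column list in the question.
SCHEMA = (
    "CREATE TABLE newstable ("
    "id INTEGER PRIMARY KEY, title TEXT, link TEXT, content TEXT, "
    "date TEXT, source TEXT, image TEXT, imagesource TEXT)"
)

# Same idea as the question's statement: insert only when no row
# with identical content exists. Values are bound, not concatenated.
INSERT_IF_NEW = (
    "INSERT INTO newstable "
    "(title, link, content, date, source, image, imagesource) "
    "SELECT ?, ?, ?, ?, ?, ?, ? "
    "WHERE NOT EXISTS (SELECT 1 FROM newstable WHERE content = ?)"
)


def insert_if_new(conn, row):
    """Insert row unless its content is already stored.

    Returns True if a row was inserted, False if it was a duplicate.
    """
    params = (
        row["title"], row["link"], row["content"], row["date"],
        row["source"], row["image"], row["imagesource"],
        row["content"],  # bound a second time for the NOT EXISTS check
    )
    cur = conn.execute(INSERT_IF_NEW, params)
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Made-up sample record, including a quote that would break naive concatenation.
sample = {
    "title": "It's a test", "link": "http://example.com/1",
    "content": "Body text", "date": "2012-02-01",
    "source": "example", "image": "", "imagesource": "",
}
print(insert_if_new(conn, sample))  # True: first insert succeeds
print(insert_if_new(conn, sample))  # False: duplicate content is skipped
```

Binding the values means a title or article body containing a quote character cannot break or alter the statement.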
Here is the search query on the reading page. (We have already optimized the home page: it is a static page generated by cron. The reading page, however, has to stay dynamic for live search.)
SELECT id,title,link,content,date,source,image,imagesource
FROM newstable
WHERE (MATCH (title,content,date)
AGAINST ('$boolean' IN BOOLEAN MODE))
Order By date DESC Limit '.($_POST['number']).', 10
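The LIMIT offset in the query above is taken directly from the request. A small sketch of validating it before it reaches the SQL might look like the following; the function name parse_offset and the cap of 1000 are assumptions chosen for illustration, and Python is used here only as a runnable stand-in for the site's PHP.

```python
def parse_offset(raw, max_offset=1000):
    """Turn a user-supplied offset into a bounded, non-negative integer.

    Anything that is not a plain integer falls back to 0, and very
    large offsets are capped so a request cannot force a deep scan.
    """
    try:
        offset = int(raw)
    except (TypeError, ValueError):
        return 0
    return min(max(offset, 0), max_offset)


print(parse_offset("20"))         # 20
print(parse_offset("-5"))         # 0
print(parse_offset("10; DROP"))   # 0
print(parse_offset("999999"))     # 1000
```

The validated integer can then be passed to the query as a bound parameter or formatted in as a number, rather than inserted as raw request text.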
Each page runs 2 or 3 queries like the one above. (* I have renamed the table and field names.)
For a news site, we need to keep the freshest news at the top, so sorting by date is required.
Now, our problem: MySQL full-text search causes high CPU usage. Monitoring the server with top, opening each page costs nearly 10% CPU. I am afraid that at this rate our site could only support a few people online at the same time, but our goal is at least 100 concurrent users. Many thanks.
Cpu(s): 10.4%us, 1.4%sy, 0.0%ni, 88.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097152k total, 570364k used, 1526788k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28265 mysql 15 0 385m 75m 5752 S 129.3 3.7 751:49.13 mysqld
1313 root 15 0 35040 18m 6400 S 7.0 0.9 0:03.55 php
1 root 15 0 2156 664 576 S 0.0 0.0 0:04.42 init
1215 root 15 -4 2260 652 436 S 0.0 0.0 0:00.00 udevd
1359 root 15 0 2240 1004 812 R 0.0 0.0 0:00.00 top
1585 root 25 0 2832 868 700 S 0.0 0.0 0:00.00 xinetd
...
EDIT: explain query result:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY newstable fulltext index_name index_name 0 1 Using where
EDIT2: ./mysqltuner.pl result
-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.5.20
[OK] Operating on 32-bit architecture with less than 2GB RAM
-------- Storage Engine Statistics -------------------------------------------
[--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 396M (Tables: 39)
[--] Data in InnoDB tables: 208K (Tables: 8)
[!!] Total fragmented tables: 9
-------- Security Recommendations -------------------------------------------
[!!] User '@ip-XX-XX-XX-XX.ip.secureserver.net'
[!!] User '@localhost'
-------- Performance Metrics -------------------------------------------------
[--] Up for: 17h 27m 58s (1M q [20.253 qps], 31K conn, TX: 513M, RX: 303M)
[--] Reads / Writes: 61% / 39%
[--] Total buffers: 168.0M global + 2.7M per thread (151 max threads)
[OK] Maximum possible memory usage: 573.8M (28% of installed RAM)
[OK] Slow queries: 0% (56/1M)
[!!] Highest connection usage: 100% (152/151)
[OK] Key buffer size / total MyISAM indexes: 8.0M/162.5M
[OK] Key buffer hit rate: 100.0% (2B cached / 882K reads)
[!!] Query cache is disabled
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 17K sorts)
[!!] Temporary tables created on disk: 49% (32K on disk / 64K total)
[!!] Thread cache is disabled
[!!] Table cache hit rate: 0% (400 open / 298K opened)
[OK] Open file limit used: 41% (421/1K)
[!!] Table locks acquired immediately: 77%
[OK] InnoDB data size / buffer pool: 208.0K/128.0M
-------- Recommendations -----------------------------------------------------
General recommendations:
Run OPTIMIZE TABLE to defragment tables for better performance
MySQL started within last 24 hours - recommendations may be inaccurate
Enable the slow query log to troubleshoot bad queries
Reduce or eliminate persistent connections to reduce connection usage
When making adjustments, make tmp_table_size/max_heap_table_size equal
Reduce your SELECT DISTINCT queries without LIMIT clauses
Set thread_cache_size to 4 as a starting value
Increase table_cache gradually to avoid file descriptor limits
Optimize queries and/or use InnoDB to reduce lock wait
Variables to adjust:
max_connections (> 151)
wait_timeout (< 28800)
interactive_timeout (< 28800)
query_cache_size (>= 8M)
tmp_table_size (> 16M)
max_heap_table_size (> 16M)
thread_cache_size (start at 4)
table_cache (> 400)
EDIT 3: my.cnf
[mysqld]
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 256M
max_allowed_packet = 16M
max_connections = 1024
wait_timeout = 5
table_open_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 2M
myisam_sort_buffer_size = 128M
thread_cache_size = 8
query_cache_size= 256M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 8
ft_min_word_len = 2
read_rnd_buffer_size=2M
tmp_table_size=128M
Best Answer
A couple of strange things stand out here.
Grab mysqltuner.pl (just type wget mysqltuner.pl) and run it against your database. It will most likely have some good suggestions.
MySQL fulltext search is not the right way to handle this anyway. Sphinx and Lucene are both good projects for search.