1
forest520 2015-01-17 17:36:56 +08:00 via iPhone
100万很少了好不好
|
3
sleshep 2015-01-17 17:40:39 +08:00
你要做什么操作?
|
5
msg7086 2015-01-17 18:08:27 +08:00
问题不在于读入,而在于读入以后要干嘛。
mysql可以unbuffer发送数据,100万根本不是问题。 |
6
minbaby 2015-01-17 18:13:26 +08:00
while True:
data = select * from table limit offset, limit offset += limit if not data: break do_something() 不要砍我.…..我只会这么干 |
7
minbaby 2015-01-17 18:13:42 +08:00
我擦, 为什么缩进不见了
|
8
zjmdp OP @minbaby 这种方法我也想到过,但是nodejs天生异步io,这种同步方法有点尴尬,而且吞吐量明显受限
|
10
zjmdp OP @minbaby 你说的方法肯定是可行的,只是觉得稍复杂一点,不知道mysql有没有内建方法应付这种情况
|
11
EPr2hh6LADQWqRVH 2015-01-17 18:28:08 +08:00 via iPhone
on data
|
12
vivisidea 2015-01-17 20:19:39 +08:00
@zjmdp MySQL ResultSet有Stream Mode,用limit/offset方式的话offset大了之后效率会慢
<code> ResultSet By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate and, due to the design of the MySQL network protocol, is easier to implement. If you are working with ResultSets that have a large number of rows or large values and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time. To enable this functionality, create a Statement instance in the following manner: stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY); stmt.setFetchSize(Integer.MIN_VALUE); The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this, any result sets created with the statement will be retrieved row-by-row. </code> http://dev.mysql.com/doc/connector-j/en/connector-j-reference-implementation-notes.html |
13
lincanbin 2015-01-17 20:52:14 +08:00
流读取,不过还是得看你想遍历这些数据干什么
|
14
invite 2015-01-17 21:04:06 +08:00
这个需要SELECT COUNT(*) 么?
假设每次SELECT 一个A, 那在SELECT 的时候, LIMIT A + 1, 然后看看结果集有没有A+1, 没有就自动退出了. |
16
zjmdp OP @invite 但你说的这种方式对于每一次select都要等待前一次结束,但node天然异步IO,可以并发select,提高吞吐量
|
17
lianghui 2015-01-17 21:37:21 +08:00
以前遍历整个表的处理,如果有自增id,比如uid,
select min(uid) as start, max(uid) as end from table_name; 稍稍的评估下保证用户矢量的区间阈值,抽取保证适当的数据 current = start + threshold; while (current < end) { // to do current += threshold; current = (current <= end) ? current : end; } |