carbondata-commits mailing list archives

From kunalkap...@apache.org
Subject [carbondata] branch master updated: [CARBONDATA-3362] Document update for pagesize table property scenario
Date Thu, 16 May 2019 09:38:20 GMT
This is an automated email from the ASF dual-hosted git repository.

kunalkapoor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
     new bf3ce9d  [CARBONDATA-3362] Document update for pagesize table property scenario
bf3ce9d is described below

commit bf3ce9d557d2ccf3656e5a9b7152955360cddaae
Author: ajantha-bhat <ajanthabhat@gmail.com>
AuthorDate: Tue May 7 14:36:05 2019 +0530

    [CARBONDATA-3362] Document update for pagesize table property scenario
    
    Document update for pagesize table property scenario.
    
    This closes #3206
---
 docs/carbon-as-spark-datasource-guide.md | 2 +-
 docs/ddl-of-carbondata.md                | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/carbon-as-spark-datasource-guide.md b/docs/carbon-as-spark-datasource-guide.md
index 598acb0..fe46b09 100644
--- a/docs/carbon-as-spark-datasource-guide.md
+++ b/docs/carbon-as-spark-datasource-guide.md
@@ -44,7 +44,7 @@ Now you can create Carbon table using Spark's datasource DDL syntax.
 |-----------|--------------|------------|
 | table_blocksize | 1024 | Size of blocks to write onto hdfs. For  more details, see [Table Block Size Configuration](./ddl-of-carbondata.md#table-block-size-configuration). |
 | table_blocklet_size | 64 | Size of blocklet to write. |
-| table_page_size_inmb | 0 | Size of each page in carbon table, if page size crosses this value before 32000 rows, page will be cut to that may rows. Helps in keep page size to fit cache size |
+| table_page_size_inmb | 0 | Size of each page in carbon table, if page size crosses this value before 32000 rows, page will be cut to that many rows. Helps in keep page size to fit cache size |
 | local_dictionary_threshold | 10000 | Cardinality upto which the local dictionary can be generated. For  more details, see [Local Dictionary Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
 | local_dictionary_enable | false | Enable local dictionary generation. For  more details, see [Local Dictionary Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
 | sort_columns | all dimensions are sorted | Columns to include in sort and its order of sort. For  more details, see [Sort Columns Configuration](./ddl-of-carbondata.md#sort-columns-configuration). |
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 5bc8f10..34eca8d 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -291,6 +291,11 @@ CarbonData DDL statements are documented here,which includes:
      If page size crosses this value before 32000 rows, page will be cut to that many rows.

      Helps in keeping page size to fit cpu cache size.
 
+     This property can be configured if the table has string, varchar, binary or complex datatype columns.
+     Because for these columns 32000 rows in one page may exceed 1755 MB and snappy compression will fail in that scenario.
+     Also if page size is huge, page cannot be fit in CPU cache. 
+     So, configuring smaller values of this property (say 1 MB) can result in better use of CPU cache for pages.
+
      Example usage:
      ```
      TBLPROPERTIES ('TABLE_PAGE_SIZE_INMB'='5')

