General Properties
Each entry gives the property, its description, and the accepted values where listed.

name
The name of the fieldType. This value is used in field definitions, in the "type" attribute. It is strongly recommended that names consist only of alphanumeric or underscore characters and not start with a digit. This is not currently strictly enforced.

class
The class name used to store and index the data for this type. You may prefix included class names with "solr." and Solr will automatically figure out which packages to search for the class, so "solr.TextField" will work. If you are using a third-party class, you will probably need a fully qualified class name. The fully qualified equivalent of "solr.TextField" is "org.apache.solr.schema.TextField".

positionIncrementGap
For multivalued fields, specifies a distance between multiple values, which prevents spurious phrase matches.
Values: integer

autoGeneratePhraseQueries
For text fields. If true, Solr automatically generates phrase queries for adjacent terms. If false, terms must be enclosed in double quotes to be treated as phrases.
Values: true or false

docValuesFormat
Defines a custom DocValuesFormat to use for fields of this type. This requires that a schema-aware codec, such as the SchemaCodecFactory, has been configured in solrconfig.xml.
Values: n/a

postingsFormat
Defines a custom PostingsFormat to use for fields of this type. This requires that a schema-aware codec, such as the SchemaCodecFactory, has been configured in solrconfig.xml.
Values: n/a
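As a rough illustration of how these properties fit together, the fragment below declares a text type and a field that refers to it by name. The names text_phrases, string_mem, and body, the analyzer chain, and the format name "Memory" are assumptions made for the example; the format names that are actually available depend on the Lucene version in use.

```xml
<!-- schema.xml (sketch): a text type using the general properties above.
     "text_phrases" is a hypothetical type name; "solr.TextField" resolves to
     org.apache.solr.schema.TextField. -->
<fieldType name="text_phrases" class="solr.TextField"
           positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- "body" is a hypothetical field; its type attribute must match a fieldType name. -->
<field name="body" type="text_phrases" indexed="true" stored="true" multiValued="true"/>

<!-- solrconfig.xml (sketch): a schema-aware codec must be configured
     before a per-type postingsFormat or docValuesFormat is honored. -->
<codecFactory class="solr.SchemaCodecFactory"/>

<!-- schema.xml (sketch): "Memory" is an assumed format name, shown only to
     illustrate where the attribute goes. -->
<fieldType name="string_mem" class="solr.StrField" postingsFormat="Memory"/>
```

With positionIncrementGap="100", consecutive values of the multivalued body field are indexed 100 positions apart, so a phrase query without a large slop does not match across the end of one value and the start of the next.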
Field Element Default Properties
Each entry gives the property, its description, and the accepted values.

indexed
If true, the value of the field can be used in queries to retrieve matching documents.
Values: true or false

stored
If true, the actual value of the field is stored and can be retrieved by queries.
Values: true or false

docValues
If true, the value of the field will be put in a column-oriented DocValues structure.
Values: true or false

sortMissingFirst
sortMissingLast
Control the placement of documents when a sort field is not present. As of Solr 3.5, these work for all numeric fields, including Trie and date fields.
Values: true or false

multiValued
If true, indicates that a single document might contain multiple values for this field type.
Values: true or false

omitNorms
If true, omits the norms associated with this field (this disables length normalization and index-time boosting for the field, and saves some memory). Defaults to true for all primitive (non-analyzed) field types, such as int, float, date, bool, and string. Only full-text fields or fields that need an index-time boost need norms.
Values: true or false

omitTermFreqAndPositions
If true, omits term frequency, positions, and payloads from postings for this field. This can be a performance boost for fields that don't require that information. It also reduces the storage space required for the index. Queries that rely on positions and are issued on a field with this option will silently fail to find documents. This property defaults to true for all fields that are not text fields.
Values: true or false

omitPositions
Similar to omitTermFreqAndPositions but preserves term frequency information.
Values: true or false

termVectors
termPositions
termOffsets
These options instruct Solr to maintain full term vectors for each document, optionally including the position and offset information for each term occurrence in those vectors. These can be used to accelerate highlighting and other ancillary functionality, but impose a substantial cost in terms of index size. They are not necessary for typical uses of Solr.
Values: true or false

required
Instructs Solr to reject any attempt to add a document that does not have a value for this field. This property defaults to false. When importing from a database with one-to-many relationships, setting this property to false is recommended.
Values: true or false

Field Types Included with Solr
falseField Types Included with Solr
以下為Solr目前可以使用的field型態列表.
這些列表classes都包含在 org.apache.solr.schema package
了解這些類別,有助於幫助建立自訂的 FieldType elements。
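To give a sense of what such elements look like, here is a minimal sketch that declares a few types from classes in this package. The type names (int, tint, tlong, random) are illustrative choices rather than requirements, and the precisionStep attribute is explained in the Trie entries of the list that follows.

```xml
<!-- schema.xml (sketch): type names are illustrative -->

<!-- precisionStep="0": efficient sorting and the smallest index -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0"/>

<!-- precisionStep="8": extra indexed terms to speed up range queries -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8"/>

<!-- The generic TrieField needs an explicit type attribute -->
<fieldType name="tlong" class="solr.TrieField" type="long" precisionStep="8"/>

<!-- RandomSortField holds no value; it is usually exposed through a dynamic field -->
<fieldType name="random" class="solr.RandomSortField" indexed="true"/>
<dynamicField name="random_*" type="random"/>
```

Because these classes ship with Solr, the short "solr." prefix is enough; a third-party class would need its fully qualified name instead.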
Each entry gives the class name followed by its description.

BCDIntField
Binary-coded decimal (BCD) integer. BCD is a relatively inefficient encoding that offers the benefits of quick decimal calculations and quick conversion to a string. This field has been deprecated and will be removed in Solr 5.0; use TrieIntField instead.

BCDLongField
Binary-coded decimal long integer. This field has been deprecated and will be removed in Solr 5.0; use TrieLongField instead.

BCDStrField
Binary-coded decimal string. This field has been deprecated and will be removed in Solr 5.0; use TrieIntField instead.

BinaryField
Binary data.

BoolField
Contains either true or false. Values of "1", "t", or "T" in the first character are interpreted as true. Any other values in the first character are interpreted as false.

ByteField
Contains a byte (an 8-bit signed integer). This field has been deprecated and will be removed in Solr 5.0; use TrieIntField instead.

CollationField
Supports Unicode collation for sorting and range queries. ICUCollationField is a better choice if you can use ICU4J. See the section Unicode Collation.

CurrencyField
Supports currencies and exchange rates. See the section Working with Currencies and Exchange Rates.

DateField
Represents a point in time with millisecond precision. See the section Working with Dates. This field has been deprecated and will be removed in Solr 5.0; use TrieDateField instead.

DoubleField
Double (64-bit IEEE floating point). This field has been deprecated and will be removed in Solr 5.0; use TrieDoubleField instead.

ExternalFileField
Pulls values from a file on disk. See the section Working with External Files and Processes.

EnumField
Allows defining an enumerated set of values which may not be easily sorted by either alphabetic or numeric order (such as a list of severities). This field type takes a configuration file, which lists the proper order of the field values. See the section Working with Enum Fields for more information.

FloatField
Floating point (32-bit IEEE floating point). This field has been deprecated and will be removed in Solr 5.0; use TrieFloatField instead.

ICUCollationField
Supports Unicode collation for sorting and range queries. See the section Unicode Collation.

IntField
Integer (32-bit signed integer). This field has been deprecated and will be removed in Solr 5.0; use TrieIntField instead.

LatLonType
Spatial Search: a latitude/longitude coordinate pair. The latitude is specified first in the pair.

LongField
Long integer (64-bit signed integer). This field has been deprecated and will be removed in Solr 5.0; use TrieLongField instead.

PointType
Spatial Search: an arbitrary n-dimensional point, useful for searching sources such as blueprints or CAD drawings.

PreAnalyzedField
Provides a way to send serialized token streams to Solr, optionally with independent stored values of a field, and have this information stored and indexed without any additional text processing. Useful if you want to submit field content that was already processed by some existing external text processing pipeline (e.g. tokenized, annotated, stemmed, inserted synonyms, etc.), while using all the rich attributes that Lucene's TokenStream provides via token attributes.

RandomSortField
Does not contain a value. Queries that sort on this field type will return results in random order. Use a dynamic field to use this feature.

ShortField
Short integer. This field has been deprecated and will be removed in Solr 5.0; use TrieIntField instead.

SortableDoubleField
The Sortable fields provide correct numeric sorting. This field has been deprecated and will be removed in Solr 5.0; use TrieDoubleField instead.

SortableFloatField
Numerically sorted floating point. This field has been deprecated and will be removed in Solr 5.0; use TrieFloatField instead.

SortableIntField
Numerically sorted integer. This field has been deprecated and will be removed in Solr 5.0; use TrieIntField instead.

SortableLongField
Numerically sorted long integer. This field has been deprecated and will be removed in Solr 5.0; use TrieLongField instead.

SpatialRecursivePrefixTreeFieldType
(RPT for short) Spatial Search: accepts latitude comma longitude strings or other shapes in WKT format.

StrField
String (UTF-8 encoded string or Unicode).

TextField
Text, usually multiple words or tokens.

TrieDateField
Date field. Represents a point in time with millisecond precision. See the section Working with Dates. precisionStep="0" enables efficient date sorting and minimizes index size; precisionStep="8" (the default) enables efficient range queries.

TrieDoubleField
Double field (64-bit IEEE floating point). precisionStep="0" enables efficient numeric sorting and minimizes index size; precisionStep="8" (the default) enables efficient range queries.

TrieField
If this field type is used, a "type" attribute must also be specified; valid values are: integer, long, float, double, date. Using this field is the same as using any of the Trie fields. precisionStep="0" enables efficient numeric sorting and minimizes index size; precisionStep="8" (the default) enables efficient range queries.

TrieFloatField
Floating point field (32-bit IEEE floating point). precisionStep="0" enables efficient numeric sorting and minimizes index size; precisionStep="8" (the default) enables efficient range queries.

TrieIntField
Integer field (32-bit signed integer). precisionStep="0" enables efficient numeric sorting and minimizes index size; precisionStep="8" (the default) enables efficient range queries.

TrieLongField
Long field (64-bit signed integer). precisionStep="0" enables efficient numeric sorting and minimizes index size; precisionStep="8" (the default) enables efficient range queries.

UUIDField
Universally Unique Identifier (UUID). Pass in a value of "NEW" and Solr will create a new UUID. Note: configuring a UUIDField instance with a default value of "NEW" is not advisable for most users when using SolrCloud (and not possible if the UUID value is configured as the unique key field), since the result will be that each replica of each document will get a unique UUID value. Using UUIDUpdateProcessorFactory to generate UUID values when documents are added is recommended instead.
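The recommendation for UUIDField can be sketched as an update processor chain in solrconfig.xml. The chain name uuid and the target field id are assumptions for illustration, and the chain still has to be selected for update requests (for example through the update.chain request parameter) before it takes effect.

```xml
<!-- solrconfig.xml (sketch): add a generated UUID to documents that arrive
     without a value in the target field -->
<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <!-- "id" is an assumed field name; declare it in schema.xml with a
         UUIDField or string type -->
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Because the value is generated once, before the document is distributed, every replica should receive the same UUID, which is the problem the default value of "NEW" runs into under SolrCloud.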
Field Properties by Use Case
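The table below summarizes common use cases and the properties that the field or field type should have to support them. Where an entry is true or false, the property must be set to that value for the use case to work correctly. Where no entry is given, the setting of that property has no effect on the use case.

| Use Case | indexed | stored | multiValued | omitNorms | termVectors | termPositions | docValues |
|---|---|---|---|---|---|---|---|
| search within field | true | | | | | | |
| retrieve contents | | true | | | | | |
| use as unique key | true | | false | | | | |
| sort on field | true [7] | | false | true [1] | | | true [7] |
| use field boosts [5] | | | | false | | | |
| document boosts affect searches within field | | | | false | | | |
| highlighting | true [4] | true | | | true [2] | true [3] | |
| faceting [5] | true [7] | | | | | | true [7] |
| add multiple values, maintaining order | | | true | | | | |
| field length affects doc score | | | | false | | | |
| MoreLikeThis [5] | | | | | true [6] | | |

The numbers refer to the following notes: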
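1. Recommended, but not required.
2. Will be used if present, but not required.
3. Applies if termVectors=true.
4. A tokenizer must be defined for the field, but the field does not need to be indexed.
5. Described in Understanding Analyzers, Tokenizers, and Filters.
6. Term vectors are not mandatory here. If they are not enabled, the stored field is analyzed instead, so term vectors are recommended, but they are only required when stored=false.
7. Either indexed or docValues must be true, but both are not required. DocValues can be more efficient in many cases.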
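As a sketch of how the table translates into field declarations, the fragment below sets only the properties that each use case calls for. The field and type names (id, title, price, tags, string, text_general, tfloat) are assumptions for illustration, and the types are presumed to be declared elsewhere in the schema.

```xml
<!-- schema.xml (sketch): field and type names are hypothetical -->

<!-- use as unique key: indexed and single-valued -->
<field name="id" type="string" indexed="true" stored="true"
       multiValued="false" required="true"/>

<!-- search within field, retrieve contents, highlighting: indexed and stored;
     term vectors and positions are used if present but are not required -->
<field name="title" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true"/>

<!-- sort on field: single-valued and indexed; per note 7, docValues="true"
     could be used in place of indexed="true" -->
<field name="price" type="tfloat" indexed="true" stored="true" multiValued="false"/>

<!-- add multiple values, maintaining order -->
<field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>

<uniqueKey>id</uniqueKey>
```

Properties left out of a declaration fall back to the defaults of the field type, so only the settings that matter for the intended use case need to be spelled out.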