💎一站式轻松地调用各大LLM模型接口,支持GPT4、智谱、星火、月之暗面及文生图 广告
[[parent-child-performance]] === Practical Considerations Parent-child joins can be a useful technique for managing relationships when index-time performance((("parent-child relationship", "performance and"))) is more important than search-time performance, but it comes at a significant cost. Parent-child queries can be 5 to 10 times slower than the equivalent nested query! ==== Memory Use At the time of going to press, the parent-child ID map is still held in memory.((("parent-child relationship", "memory usage")))((("memory usage", "parent-child ID map"))) There are plans to change the map to use doc values instead, which will be a big memory saving. Until that happens, you need to be aware of the following: the string `_id` field of every parent document has to be held in memory, and every child document requires 8 bytes (a long value) of memory. Actually, it's a bit less thanks to compression, but this gives you a rough idea. You can check how much memory is being used by the parent-child cache by consulting ((("indices-stats API")))the `indices-stats` API (for a summary at the index level) or the `node-stats` API (for a summary at the node level): [source,json] ------------------------- GET /_nodes/stats/indices/id_cache?human <1> ------------------------- <1> Returns memory use of the ID cache summarized by node in a human-friendly format. ==== Global Ordinals and Latency Parent-child uses <<global-ordinals,global ordinals>> to speed((("global ordinals")))((("parent-child relationship", "global ordinals and latency"))) up joins. Regardless of whether the parent-child map uses an in-memory cache or on-disk doc values, global ordinals still need to be rebuilt after any change to the index. The more parents in a shard, the longer global ordinals will take to build. Parent-child is best suited to situations where there are many children for each parent, rather than many parents and few children. Global ordinals, by default, are built lazily: the first parent-child query or aggregation after a refresh will trigger building of global ordinals. This can introduce a significant latency spike for your users. You can use <<eager-global-ordinals,`eager_global_ordinals`>> to((("eager global ordinals"))) shift the cost of building global ordinals from query time to refresh time, by mapping the `_parent` field as follows: [source,json] ------------------------- PUT /company { "mappings": { "branch": {}, "employee": { "_parent": { "type": "branch", "fielddata": { "loading": "eager_global_ordinals" <1> } } } } } ------------------------- <1> Global ordinals for the `_parent` field will be built before a new segment becomes visible to search. With many parents, global ordinals can take several seconds to build. In this case, it makes sense to increase the `refresh_interval` so((("refresh_interval setting"))) that refreshes happen less often and global ordinals remain valid for longer. This will greatly reduce the CPU cost of rebuilding global ordinals every second. ==== Multigenerations and Concluding Thoughts The ability to join multiple generations (see <<grandparents>>) sounds attractive until ((("grandparents and grandchildren")))((("parent-child relationship", "multi-generations")))you think of the costs involved: * The more joins you have, the worse performance will be. * Each generation of parents needs to have their string `_id` fields stored in memory, which can consume a lot of RAM. As you consider your relationship schemes and whether parent-child is right for you, consider this advice ((("parent-child relationship", "guidelines for using")))about parent-child relationships: * Use parent-child relationships sparingly, and only when there are many more children than parents. * Avoid using multiple parent-child joins in a single query. * Avoid scoring by using the `has_child` filter, or the `has_child` query with `score_mode` set to `none`. * Keep the parent IDs short, so that they require less memory. _Above all:_ think about the other relationship techniques that we have discussed before reaching for parent-child.