crp-wrkload-determination - Details for Query 20961

Submitted Time: 2026/05/01 13:14:29
Duration: 9 s
Succeeded Jobs: 46367 46368 46369 46370

digraph G {
  0 [labelType="html" label=" AdaptiveSparkPlan "];
  subgraph cluster1 {
    isCluster="true";
    label="WholeStageCodegen (2)\n \nduration: 13 ms";
    2 [labelType="html" label="HashAggregate time in aggregation build: 13 ms number of output rows: 1"];
  }
  3 [labelType="html" label="Exchange shuffle records written: 200 local merged chunks fetched: 0 shuffle write time total (min, med, max (stageId: taskId)) 31 ms (0 ms, 0 ms, 0 ms (stage 118907.0: task 20966162)) remote merged bytes read: 0.0 B local merged blocks fetched: 0 corrupt merged block chunks: 0 remote merged reqs duration: 0 ms remote merged blocks fetched: 0 records read: 200 local bytes read: 3.0 KiB fetch wait time: 0 ms remote bytes read: 11.0 KiB merged fetch fallback count: 0 local blocks read: 43 remote merged chunks fetched: 0 remote blocks read: 157 data size total (min, med, max (stageId: taskId)) 3.1 KiB (16.0 B, 16.0 B, 16.0 B (stage 118907.0: task 20966033)) local merged bytes read: 0.0 B number of partitions: 1 remote reqs duration: 13 ms remote bytes read to disk: 0.0 B shuffle bytes written total (min, med, max (stageId: taskId)) 14.1 KiB (72.0 B, 72.0 B, 72.0 B (stage 118907.0: task 20966033))"];
  subgraph cluster4 {
    isCluster="true";
    label="WholeStageCodegen (1)\n \nduration: total (min, med, max (stageId: taskId))\n0 ms (0 ms, 0 ms, 0 ms (stage 118907.0: task 20966033))";
    5 [labelType="html" label="HashAggregate time in aggregation build total (min, med, max (stageId: taskId)) 0 ms (0 ms, 0 ms, 0 ms (stage 118907.0: task 20966033)) number of output rows: 200"];
    6 [labelType="html" label=" Project "];
    7 [labelType="html" label="Filter number of output rows: 0"];
  }
  8 [labelType="html" label="InMemoryTableScan number of output rows: 0"];
  9 [labelType="html" label=" AdaptiveSparkPlan "];
  subgraph cluster10 {
    isCluster="true";
    label="WholeStageCodegen (2)\n \nduration: total (min, med, max (stageId: taskId))\n0 ms (0 ms, 0 ms, 0 ms (stage 118904.0: task 20965837))";
    11 [labelType="html" label=" SerializeFromObject "];
  }
  12 [labelType="html" label=" MapGroups "];
  subgraph cluster13 {
    isCluster="true";
    label="WholeStageCodegen (1)\n \nduration: total (min, med, max (stageId: taskId))\n0 ms (0 ms, 0 ms, 0 ms (stage 118904.0: task 20965837))";
    14 [labelType="html" label="Sort sort time total (min, med, max (stageId: taskId)) 0 ms (0 ms, 0 ms, 0 ms (stage 118904.0: task 20965837)) peak memory total (min, med, max (stageId: taskId)) 12.5 MiB (64.0 KiB, 64.0 KiB, 64.0 KiB (stage 118904.0: task 20965837)) spill size total (min, med, max (stageId: taskId)) 0.0 B (0.0 B, 0.0 B, 0.0 B (stage 118904.0: task 20965837))"];
  }
  15 [labelType="html" label="Exchange shuffle records written: 0 local merged chunks fetched: 0 shuffle write time: 0 ms remote merged bytes read: 0.0 B local merged blocks fetched: 0 corrupt merged block chunks: 0 remote merged reqs duration: 0 ms remote merged blocks fetched: 0 records read: 0 local bytes read: 0.0 B fetch wait time: 0 ms remote bytes read: 0.0 B merged fetch fallback count: 0 local blocks read: 0 remote merged chunks fetched: 0 remote blocks read: 0 data size: 0.0 B local merged bytes read: 0.0 B number of partitions: 200 remote reqs duration: 0 ms remote bytes read to disk: 0.0 B shuffle bytes written: 0.0 B"];
  16 [labelType="html" label=" AppendColumnsWithObject "];
  17 [labelType="html" label="Scan number of output rows: 0"];
  2->0;
  3->2;
  5->3;
  6->5;
  7->6;
  8->7;
  9->8;
  11->9;
  12->11;
  14->12;
  15->14;
  16->15;
  17->16;
}
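
Several metrics in the plan graph above use the form "total (min, med, max (stageId: taskId))", for example the shuffle write time on the first Exchange node. The sketch below shows one way such a string might be split into fields for further analysis. The helper name `parse_metric` and the regular expression are illustrative assumptions, not part of Spark or of this page.

```python
import re

# Hypothetical helper (not a Spark API): splits a metric value of the form
# "<total> (<min>, <med>, <max> (stage <stageId>: task <taskId>))".
METRIC_RE = re.compile(
    r"^(?P<total>[^()]+?) \((?P<min>[^,()]+), (?P<med>[^,()]+), (?P<max>[^,()]+?) "
    r"\(stage (?P<stage>[\d.]+): task (?P<task>\d+)\)\)$"
)


def parse_metric(text):
    """Return a dict of the metric's parts, or None if the text has no breakdown."""
    m = METRIC_RE.match(text.strip())
    if m is None:
        return None
    fields = m.groupdict()
    fields["task"] = int(fields["task"])  # task IDs are plain integers
    return fields


# Value copied from the "shuffle write time" metric of the Exchange node.
example = parse_metric("31 ms (0 ms, 0 ms, 0 ms (stage 118907.0: task 20966162))")
```

Single-valued metrics such as "duration: 13 ms" carry no per-task breakdown, so this helper returns `None` for them.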

AdaptiveSparkPlan isFinalPlan=true

HashAggregate(keys=[], functions=[count(1)])

WholeStageCodegen (2)

Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=2955561]

HashAggregate(keys=[], functions=[partial_count(1)])

Project

Filter isDir#3265229: boolean

WholeStageCodegen (1)

InMemoryTableScan [isDir#3265229], [isDir#3265229]

AdaptiveSparkPlan isFinalPlan=true

SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265227, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265228L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265229, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265230L]

WholeStageCodegen (2)

MapGroups org.apache.spark.sql.KeyValueGroupedDataset$$Lambda$6934/0x0000000801d21280@1c9d3ae7, value#3265221.toString, newInstance(class org.apache.spark.sql.delta.SerializableFileStatus), [value#3265221], [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L], obj#3265226: org.apache.spark.sql.delta.SerializableFileStatus

Sort [value#3265221 ASC NULLS FIRST], false, 0

WholeStageCodegen (1)

Exchange hashpartitioning(value#3265221, 200), ENSURE_REQUIREMENTS, [plan_id=2955480]

AppendColumnsWithObject org.apache.spark.sql.delta.commands.VacuumCommand$$$Lambda$6931/0x0000000801d1d990@4e6d2c38, [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265211, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265212L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265213, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265214L], [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true, false, true) AS value#3265221]

Scan[obj#3265210]

Details

== Physical Plan ==
AdaptiveSparkPlan (26)
+- == Final Plan ==
   * HashAggregate (20)
   +- ShuffleQueryStage (19), Statistics(sizeInBytes=3.1 KiB, rowCount=200)
      +- Exchange (18)
         +- * HashAggregate (17)
            +- * Project (16)
               +- * Filter (15)
                  +- TableCacheQueryStage (14), Statistics(sizeInBytes=0.0 B, rowCount=0)
                     +- InMemoryTableScan (1)
                           +- InMemoryRelation (2)
                                 +- AdaptiveSparkPlan (13)
                                 +- == Final Plan ==
                                    * SerializeFromObject (9)
                                    +- MapGroups (8)
                                       +- * Sort (7)
                                          +- ShuffleQueryStage (6), Statistics(sizeInBytes=0.0 B, rowCount=0)
                                             +- Exchange (5)
                                                +- AppendColumnsWithObject (4)
                                                   +- Scan (3)
                                 +- == Initial Plan ==
                                    SerializeFromObject (12)
                                    +- MapGroups (11)
                                       +- Sort (10)
                                          +- Exchange (5)
                                             +- AppendColumnsWithObject (4)
                                                +- Scan (3)
+- == Initial Plan ==
   HashAggregate (25)
   +- Exchange (24)
      +- HashAggregate (23)
         +- Project (22)
            +- Filter (21)
               +- InMemoryTableScan (1)
                     +- InMemoryRelation (2)
                           +- AdaptiveSparkPlan (13)
                           +- == Final Plan ==
                              * SerializeFromObject (9)
                              +- MapGroups (8)
                                 +- * Sort (7)
                                    +- ShuffleQueryStage (6), Statistics(sizeInBytes=0.0 B, rowCount=0)
                                       +- Exchange (5)
                                          +- AppendColumnsWithObject (4)
                                             +- Scan (3)
                           +- == Initial Plan ==
                              SerializeFromObject (12)
                              +- MapGroups (11)
                                 +- Sort (10)
                                    +- Exchange (5)
                                       +- AppendColumnsWithObject (4)
                                          +- Scan (3)


(1) InMemoryTableScan
Output [1]: [isDir#3265229]
Arguments: [isDir#3265229], [isDir#3265229]

(2) InMemoryRelation
Arguments: [path#3265227, length#3265228L, isDir#3265229, modificationTime#3265230L], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@685dbe62,StorageLevel(disk, memory, deserialized, 1 replicas),AdaptiveSparkPlan isFinalPlan=true
+- == Final Plan ==
   *(2) SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265227, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265228L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265229, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265230L]
   +- MapGroups org.apache.spark.sql.KeyValueGroupedDataset$$Lambda$6934/0x0000000801d21280@1c9d3ae7, value#3265221.toString, newInstance(class org.apache.spark.sql.delta.SerializableFileStatus), [value#3265221], [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L], obj#3265226: org.apache.spark.sql.delta.SerializableFileStatus
      +- *(1) Sort [value#3265221 ASC NULLS FIRST], false, 0
         +- ShuffleQueryStage 0
            +- Exchange hashpartitioning(value#3265221, 200), ENSURE_REQUIREMENTS, [plan_id=2955480]
               +- AppendColumnsWithObject org.apache.spark.sql.delta.commands.VacuumCommand$$$Lambda$6931/0x0000000801d1d990@4e6d2c38, [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265211, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265212L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265213, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265214L], [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true, false, true) AS value#3265221]
                  +- Scan[obj#3265210]
+- == Initial Plan ==
   SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265227, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265228L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265229, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265230L]
   +- MapGroups org.apache.spark.sql.KeyValueGroupedDataset$$Lambda$6934/0x0000000801d21280@1c9d3ae7, value#3265221.toString, newInstance(class org.apache.spark.sql.delta.SerializableFileStatus), [value#3265221], [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L], obj#3265226: org.apache.spark.sql.delta.SerializableFileStatus
      +- Sort [value#3265221 ASC NULLS FIRST], false, 0
         +- Exchange hashpartitioning(value#3265221, 200), ENSURE_REQUIREMENTS, [plan_id=2955480]
            +- AppendColumnsWithObject org.apache.spark.sql.delta.commands.VacuumCommand$$$Lambda$6931/0x0000000801d1d990@4e6d2c38, [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265211, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265212L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265213, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265214L], [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true, false, true) AS value#3265221]
               +- Scan[obj#3265210]
,None)

(3) Scan
Output [1]: [obj#3265210]
Arguments: obj#3265210: org.apache.spark.sql.delta.SerializableFileStatus, MapPartitionsRDD[200963] at $anonfun$recordDeltaOperationInternal$1 at DatabricksLogging.scala:128

(4) AppendColumnsWithObject
Input [1]: [obj#3265210]
Arguments: org.apache.spark.sql.delta.commands.VacuumCommand$$$Lambda$6931/0x0000000801d1d990@4e6d2c38, [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265211, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265212L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265213, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265214L], [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true, false, true) AS value#3265221]

(5) Exchange
Input [5]: [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L, value#3265221]
Arguments: hashpartitioning(value#3265221, 200), ENSURE_REQUIREMENTS, [plan_id=2955480]

(6) ShuffleQueryStage
Output [5]: [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L, value#3265221]
Arguments: 0

(7) Sort [codegen id : 1]
Input [5]: [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L, value#3265221]
Arguments: [value#3265221 ASC NULLS FIRST], false, 0

(8) MapGroups
Input [5]: [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L, value#3265221]
Arguments: org.apache.spark.sql.KeyValueGroupedDataset$$Lambda$6934/0x0000000801d21280@1c9d3ae7, value#3265221.toString, newInstance(class org.apache.spark.sql.delta.SerializableFileStatus), [value#3265221], [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L], obj#3265226: org.apache.spark.sql.delta.SerializableFileStatus

(9) SerializeFromObject [codegen id : 2]
Input [1]: [obj#3265226]
Arguments: [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265227, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265228L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265229, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265230L]

(10) Sort
Input [5]: [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L, value#3265221]
Arguments: [value#3265221 ASC NULLS FIRST], false, 0

(11) MapGroups
Input [5]: [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L, value#3265221]
Arguments: org.apache.spark.sql.KeyValueGroupedDataset$$Lambda$6934/0x0000000801d21280@1c9d3ae7, value#3265221.toString, newInstance(class org.apache.spark.sql.delta.SerializableFileStatus), [value#3265221], [path#3265211, length#3265212L, isDir#3265213, modificationTime#3265214L], obj#3265226: org.apache.spark.sql.delta.SerializableFileStatus

(12) SerializeFromObject
Input [1]: [obj#3265226]
Arguments: [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).path, true, false, true) AS path#3265227, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).length AS length#3265228L, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).isDir AS isDir#3265229, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.delta.SerializableFileStatus, true])).modificationTime AS modificationTime#3265230L]

(13) AdaptiveSparkPlan
Output [4]: [path#3265227, length#3265228L, isDir#3265229, modificationTime#3265230L]
Arguments: isFinalPlan=true

(14) TableCacheQueryStage
Output [1]: [isDir#3265229]
Arguments: 0

(15) Filter [codegen id : 1]
Input [1]: [isDir#3265229]
Condition : isDir#3265229

(16) Project [codegen id : 1]
Output: []
Input [1]: [isDir#3265229]

(17) HashAggregate [codegen id : 1]
Input: []
Keys: []
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#3265330L]
Results [1]: [count#3265331L]

(18) Exchange
Input [1]: [count#3265331L]
Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=2955561]

(19) ShuffleQueryStage
Output [1]: [count#3265331L]
Arguments: 1
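
The statistics reported for this shuffle stage can be cross-checked against the per-task figures in the Exchange metrics. The sketch below assumes 200 map tasks that each write one 16 B record into a 72 B shuffle block; this is inferred from "shuffle records written: 200" and the identical min/med/max values, not stated outright on the page. The `kib` helper is hypothetical and only approximates the UI's rounding.

```python
def kib(n_bytes):
    """Convert bytes to KiB, rounded to one decimal place (approximates the UI display)."""
    return round(n_bytes / 1024, 1)


TASKS = 200              # assumption: one map task per partition of the cached relation
DATA_PER_TASK = 16       # bytes, from "data size ... (16.0 B, 16.0 B, 16.0 B ...)"
WRITTEN_PER_TASK = 72    # bytes, from "shuffle bytes written ... (72.0 B, ...)"

data_size_kib = kib(TASKS * DATA_PER_TASK)     # reported as 3.1 KiB
written_kib = kib(TASKS * WRITTEN_PER_TASK)    # reported as 14.1 KiB

# The reduce side read 43 local and 157 remote blocks, 200 in total.
local_kib = kib(43 * WRITTEN_PER_TASK)         # reported as 3.0 KiB
remote_kib = kib(157 * WRITTEN_PER_TASK)       # reported as 11.0 KiB
```

Under these assumptions the computed values agree with the reported data size, bytes written, and local/remote bytes read.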

(20) HashAggregate [codegen id : 2]
Input [1]: [count#3265331L]
Keys: []
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#3265267L]
Results [1]: [count(1)#3265267L AS count#3265268L]
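
Operators (15) through (20) form a two-phase count: each partition filters on isDir and produces one partial count, the Exchange moves the partial rows to a single partition, and the final HashAggregate sums them. The plain-Python sketch below models that shape; it is an illustration, not Spark code. It assumes 200 empty input partitions, which matches the metrics shown (Filter output rows: 0, partial HashAggregate output rows: 200), and all function names are hypothetical.

```python
def partial_count(partition):
    """Model of Filter -> Project -> partial_count(1) for one partition.

    A keyless partial aggregate emits exactly one row per partition,
    even when the partition is empty.
    """
    return sum(1 for is_dir in partition if is_dir)


def two_phase_count(partitions):
    """Model of the whole pipeline: partial counts, single-partition exchange, final sum."""
    partials = [partial_count(p) for p in partitions]  # one row per partition
    # Exchange SinglePartition: every partial row is sent to the same reducer,
    # which adds them up to produce the single output row.
    return partials, sum(partials)


# 200 empty partitions, mirroring hashpartitioning(value, 200) over an empty scan.
partials, total = two_phase_count([[] for _ in range(200)])
```

In this model the 200 partial rows explain why the Exchange writes 200 records although the Filter emits none. The total is 0, which is what the metrics suggest for this run.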

(21) Filter
Input [1]: [isDir#3265229]
Condition : isDir#3265229

(22) Project
Output: []
Input [1]: [isDir#3265229]

(23) HashAggregate
Input: []
Keys: []
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#3265330L]
Results [1]: [count#3265331L]

(24) Exchange
Input [1]: [count#3265331L]
Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=2955499]

(25) HashAggregate
Input [1]: [count#3265331L]
Keys: []
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#3265267L]
Results [1]: [count(1)#3265267L AS count#3265268L]

(26) AdaptiveSparkPlan
Output [1]: [count#3265268L]
Arguments: isFinalPlan=true

SQL / DataFrame Properties

Name	Value
spark.sql.parquet.fieldId.read.enabled	true
spark.sql.parquet.fieldId.write.enabled	true