Considerations for High Volume ETL

https://msdn.microsoft.com/en-us/library/cc671624.aspx?f=255&MSPPError=-2147217396

Advertisements

SQL Server CDC Table-Values Functions

Microsoft SQL Server automatically generate table valued functions for the change instance when CDC tracking is enabled on tables.

Unfortunately, these functions may report the misleading error message:

An insufficient number of arguments were supplied for the procedure or function cdc.fn_cdc_get_net_changes_ ...

The real meaning of the error message is:

  • one or more arguments are null;
  • the LSN is outside the range of min(start_lsn) and the max(start_lsn) in the cdc.lsn_time_mapping table; or
  • the from_lsn is less than the start_lsn of the capture_instance listed in the system table cdc.change_tables;
  • the from_lsn is greather than the to_lsn.

In short the lsn values specifying the range must be existing in the cdc.lsn_time_mapping table.

But, I am using the system function sys.fn_cdc_map_time_to_lsn to map the time to lsn. How come I ended up with a lsn does not exist in the cdc.lsn_time_mapping table? Yes, it will happen. The relational operator “smallest greater than” and “smallest greater than or equal” map the time to a lsn greater the the given time. For example, the max time in the cdc.lsn_time_mapping table is 2017-01-01, the return value will be null if the givn time is later than “2017-01-01”.

It is confusing.

On more thing is the the system function sys.fn_cdc_increment_lsn and sys.fn_cdc_decrement_lsn do not return the next value from the cdc.lsn_time_mapping table. It just returns the next lsn value.

HTTP Entities

HttpClient distinguishes three kinds of entities, depending on where their content originates:

streamed: The content is received from a stream, or generated on the fly. In particular, this category includes entities being received from HTTP responses. Streamed entities are generally not repeatable.

self-contained: The content is in memory or obtained by means that are independent from a connection or other entity. Self-contained entities are generally repeatable. This type of entities will be mostly used for entity enclosing HTTP requests.

wrapping: The content is obtained from another entity.

https://hc.apache.org/httpcomponents-client-ga/tutorial/html/fundamentals.html

TSQL Month End

WITH DATES AS
(
SELECT GETDATE() DT
UNION ALL
SELECT DT + 1
FROM DATES
WHERE DT < GETDATE() + 3000
)
SELECT CAST(DT AS DATE) AS DATE
, MonthEnd = CASE WHEN DAY(DT + 1) = 1 THEN 1 ELSE 0 END
, QuarterEnd = CASE WHEN DATEPART(quarter, DT + 1) != DATEPART(quarter, DT) THEN 1 ELSE 0 END
, YearEnd = CASE WHEN YEAR(DT + 1) != YEAR(DT) THEN 1 ELSE 0 END
FROM DATES
OPTION (MAXRECURSION 3000)