43 Detecting gaps in sequences

The content is under development and finalisation.

Audit analytics often requires checking for gaps in sequences of numbers. Gaps in sequentially numbered documents such as purchase orders, invoices and cheques should be accounted for, so auditors may need to perform this check as part of audit analytics.

43.1 When sequence numbers are available as a numeric column

Such a sequence has a starting number and an ending number. Let us store these in two variables.

start_num <- 500301 # say
end_num <- 503500 # say

This means that 3200 numbers (say, cheques) were issued in the series, since 503500 - 500301 + 1 = 3200. Suppose further that the cheque numbers actually recorded are stored in a column, say cheque_no, of a given data frame, and that there are 3177 of them. To simulate this:

set.seed(123)
cheque_no <- sample(start_num:end_num, 3177, replace = FALSE)
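As a quick sanity check on the arithmetic above, the size of the full series and the number of gaps to expect can be computed directly. This is a minimal sketch that repeats the simulation so that it runs on its own; the names expected_count, issued_count and expected_gaps are illustrative and not part of the original text.

```r
start_num <- 500301
end_num <- 503500

set.seed(123)
cheque_no <- sample(start_num:end_num, 3177, replace = FALSE)

# Both end points belong to the series, hence the + 1
expected_count <- end_num - start_num + 1
# Distinct numbers actually recorded
issued_count <- length(unique(cheque_no))
# Number of gaps we should expect to find
expected_gaps <- expected_count - issued_count

expected_count  # 3200
expected_gaps   # 23
```

If the number of gaps reported later differs from expected_gaps, the recorded column probably contains duplicates or numbers outside the series.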

To find the gaps, we can simply apply the function setdiff to the full expected sequence and the recorded cheque numbers.

setdiff(start_num:end_num, cheque_no)
##  [1] 500398 500445 500457 500716 500747 500862 501017 501018 501109 501333
## [11] 501459 501609 501823 501908 502160 502191 502609 502937 502974 503002
## [21] 503284 503385 503422
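The output above contains consecutive missing numbers (501017 and 501018), and for reporting it can be convenient to present such runs as ranges. The following is only a sketch on a small, made-up data frame: the data frame cheques, its values, and the names run_id and gap_ranges are assumptions for illustration.

```r
# Hypothetical small series so the result is easy to verify by eye
start_num <- 101
end_num <- 110
cheques <- data.frame(cheque_no = c(101, 102, 105, 106, 110))

# Same technique as above, applied to a data frame column
gaps <- setdiff(start_num:end_num, cheques$cheque_no)
gaps  # 103 104 107 108 109

# A new run starts wherever the step between successive gaps is not 1
run_id <- cumsum(c(TRUE, diff(gaps) != 1))

# One row per run of consecutive missing numbers
gap_ranges <- data.frame(
  from = as.vector(tapply(gaps, run_id, min)),
  to   = as.vector(tapply(gaps, run_id, max))
)
gap_ranges$count <- gap_ranges$to - gap_ranges$from + 1
gap_ranges
```

Here gap_ranges has two rows, 103 to 104 and 107 to 109, which is often easier to follow up than a long list of individual numbers.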

43.2 When sequence numbers are available as a character column

The above procedure is easily replicated for gap detection even if the sequence column is of character type. For example, if the cheque numbers carry a prefix, say ‘A’, they may look like this:

##  [1] "A502763" "A502811" "A502527" "A500826" "A500495" "A503286" "A502142"
##  [8] "A501442" "A501553" "A501568"
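The text does not show how this character column was produced. One way to simulate it, assuming the prefix is simply pasted onto the numbers simulated earlier, is sketched below; overwriting cheque_no with the prefixed values is an assumption made here for illustration.

```r
start_num <- 500301
end_num <- 503500

set.seed(123)
cheque_no <- sample(start_num:end_num, 3177, replace = FALSE)

# Assumed step: attach the prefix, turning the column into character
cheque_no <- paste0("A", cheque_no)

# Inspect the first few values
head(cheque_no, 10)
```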

In this case, we can first strip the prefix (replace it with an empty string), convert the result to integer, and then proceed as above.

# "^A" anchors the match so that only a leading "A" is removed
modified_cheques <- sub("^A", "", cheque_no) |> as.integer()
missing_cheques <- setdiff(start_num:end_num, modified_cheques)
missing_cheques
##  [1] 500398 500445 500457 500716 500747 500862 501017 501018 501109 501333
## [11] 501459 501609 501823 501908 502160 502191 502609 502937 502974 503002
## [21] 503284 503385 503422
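Two small refinements may be useful in practice: checking that every value really converted to a number, and reporting the gaps in the same prefixed form as the source data. The sketch below uses a tiny made-up vector so that the result can be verified by eye; the values and the names cheque_int, bad_entries and missing_labels are illustrative assumptions.

```r
# Hypothetical prefixed cheque numbers with two numbers missing
cheque_no <- c("A500301", "A500302", "A500305", "A500306")
start_num <- 500301
end_num <- 500306

# Strip the leading prefix and convert to integer
cheque_int <- as.integer(sub("^A", "", cheque_no))

# Entries that did not convert cleanly (as.integer returns NA for them)
bad_entries <- cheque_no[is.na(cheque_int)]
length(bad_entries)  # 0

# Gaps, then the same gaps with the prefix restored for reporting
missing_cheques <- setdiff(start_num:end_num, cheque_int)
missing_labels <- paste0("A", missing_cheques)
missing_labels  # "A500303" "A500304"
```

Any values in bad_entries should be investigated before the gap list is relied upon, since an unparsed cheque number would otherwise be reported as missing.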