time_complex

Given a corpus $\mathcal{C}$ , the time complex can be computed as

Frequent pattern mining: Since the operation of Hash table is $O(1)$ , both the time and space complexities are $O(w|\mathcal{C}|)$ . $w$ is a small constant indicating the maximum phrase length, so this step is linear to the size of corpus $|\mathcal{C}|$ .

Feature Extraction: When extracting features, we take the advantage of the Aho-Corasick Automaton algorithm and tailor it to ﬁnd all the occurrences of phrase candidates. The time complexity is $O(|\mathcal{C}| + |\mathcal{N}|)$ where $|\mathcal{N}|$ refers to the total number of frequent concept candidates. As the length of each candidate is limited by a constant $w$ , $O(|\mathcal{N}|)$ < $O(|\mathcal{C}|)$ , so the complexity is $O(|\mathcal{C}|)$ .

Concept Quality Estimation: As we only labeled a very small set of concept candidates, as long as the number and depth of decision trees in the random forest are some constant, the training time for the classiﬁer is very small compared to other parts. For the prediction stage, it is proportional to the size of concept candidates and the dimensions of features. Therefore, it could be $O(|\mathcal{C}|)$ in time, although the actual magnitude might be smaller.

So the total concept extraction costs $O(|\mathcal{C}|)$ .

Word Embedding: We train word embedding for text data by Google's word2vec tool with negative sampling, the time complexity is $O(log(|\mathcal{C}|))$ .
Image Embedding: We use pre-trained VGG-19 to extract features of images, with the parallel computing platform like CUDA, the time complexity is $O(|\mathcal{I}|)$ , where the $|\mathcal{I}|$ is the size of images set.
Build graph:
- The multimodal content-based graph (MCG) can be constructed in linear time complexity $O(|\mathcal{C}|)$ .
- The co-occurrence context-based graph (CCG) is derived from MCG by traversing the $V_s$ node, so this step is linear to the size of concept $O(\mathcal{N})$ .
Graph Feature Extraction: MCG and CCG Features including similarities, importance, and entropy which computed pair-wisely, so the time complexity is $O(\mathcal{|N|^2})$

So the total graph construction costs $O(|\mathcal{C}|) + O(|\mathcal{N}|^2)$

VAE: The encoder and decoder of VAE are three layers fully connected feed forward neural network (N->5000->2000->300), the time complexity is $O(\mathcal{|N|}(5000 * \mathcal{|N|} + 5000*2000 + 2000*300)) = O(|\mathcal{N}^2|)$
GMM: The time complexity is $O(i * k * \mathcal{|N|})$ , where i is the number of iterations, $k$ is the compontent number of GMM, the final time complexity is $O(\mathcal{|N|})$ , as $i, k$ can be treated as small constant.

Fusion: We compute the KL divergence for every two K-compontents GMM, and the KL of two Gaussian distribution costs $O（d^3)$ , so the time complexity of merge operation is $O(k^2d^3)$ , as $k, d$ is small constant, the could be a constant time.

So the tital variation deep graph embedding and clustering cost $O(\mathcal{|N|}^2)$

the total time complexity is $O(|\mathcal{C}|) + O(|\mathcal{N}|^2)$