diff --git a/(p,t) compression/LICENSE-Apache2.0 b/(p,t) compression/LICENSE-Apache2.0 new file mode 100644 index 0000000000000000000000000000000000000000..72f817fb44de8b9fd23fe71230b9dc5ccbe4ca35 --- /dev/null +++ b/(p,t) compression/LICENSE-Apache2.0 @@ -0,0 +1,201 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
\ No newline at end of file diff --git a/(p,t) compression/LICENSE-MIT.txt b/(p,t) compression/LICENSE-MIT.txt new file mode 100644 index 0000000000000000000000000000000000000000..6d33b12db6ddfca2826f665b907b66188822c129 --- /dev/null +++ b/(p,t) compression/LICENSE-MIT.txt @@ -0,0 +1,23 @@ +Copyright (c) 2012 Vladimir Keleshev, <vladimir@keleshev.com> + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated +documentation files (the "Software"), to deal in the Software +without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to +whom the Software is furnished to do so, subject to the +following conditions: + +The above copyright notice and this permission notice shall +be included in all copies or substantial portions of the +Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY +KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE +WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. \ No newline at end of file diff --git a/(p,t) compression/Notice.txt b/(p,t) compression/Notice.txt new file mode 100644 index 0000000000000000000000000000000000000000..308703a87659a660dc3f846ee01208f5f1c44fd5 --- /dev/null +++ b/(p,t) compression/Notice.txt @@ -0,0 +1,23 @@ +Neighborhood Preserving graph compression + + + all software is copyright 2021 + Abd Errahmane Kiouche, licensed under the Apache License 2.0. + + + +docopt.cpp + + Copyright (c) 2012 Vladimir Keleshev, <vladimir@keleshev.com> + Licensed under the MIT license (see LICENSE-MIT). 
+ https://github.com/docopt/docopt.cpp + + Files: docopt.cpp docopt.h docopt_private.h docopt_util.h docopt_value.h + Modifications: Remove "#pragma mark" directives. + + +hash + + Copyright 2016 Emaad Ahmed Manzoor + License: Apache License, Version 2.0 + Files : hash.h diff --git a/(p,t) compression/graph.cpp b/(p,t) compression/graph.cpp new file mode 100644 index 0000000000000000000000000000000000000000..5f9be2a24cc4e27f5f18f0d6cee002cca19a58d9 --- /dev/null +++ b/(p,t) compression/graph.cpp @@ -0,0 +1,65 @@ +// +// Created by Kiouche on 1/20/2020. +// + +#include <unordered_map> +#include <unordered_set> +#include <vector> +#include "graph.h" +#include "hash.h" + + + + + +namespace std { + + vector<edge> get_edges(graph &g,bool directed){ + vector<edge> edges; + unordered_set<edge> s_edges; + for (auto & s : g){ + for (auto & d : s.second ){ + edge e,r_e; + e.first = s.first; + e.second = d; + r_e.first = d; + r_e.second = s.first; + if(!directed) { + if (s_edges.find(e) == s_edges.end()) { + edges.push_back(e); + } + } + else edges.push_back(e); + s_edges.insert(e); + s_edges.insert(r_e); + } + } + return edges; + } + + bool is_it_undirected(graph &g){ + for (auto s : g){ + for (auto d : s.second ){ + if ( g[d].find(s.first)==g[d].end()) return false; + } + } + return true; + } + + unordered_set<uint32_t > intersection(unordered_set<uint32_t> &s1,unordered_set<uint32_t> &s2){ + unordered_set<uint32_t > intersection; + if (s1.size()< s2.size()){ + for (auto n :s1){ + if (s2.find(n)!=s2.end()) intersection.insert(n); + } + } + else{ + for (auto n :s2){ + if (s1.find(n)!=s1.end()) intersection.insert(n); + } + } + return intersection; + } + + +} \ No newline at end of file diff --git a/(p,t) compression/graph.h b/(p,t) compression/graph.h new file mode 100644 index 0000000000000000000000000000000000000000..fc9c71f35f04222a2a09347acc3fe4bc0dad0d4a --- /dev/null +++ b/(p,t) compression/graph.h @@ -0,0 +1,36 @@ +// +// Created by Kiouche on 1/20/2020. 
+// + +#ifndef P_K_COMPRESSION_GRAPH_H +#define P_K_COMPRESSION_GRAPH_H +#include <unordered_map> +#include <unordered_set> +#include <vector> + + + + + +namespace std{ + + typedef unordered_map<uint32_t,unordered_map<int,unordered_set<uint32_t >>> node_neighbors; /// key 1 = node_id , key 2 = depth , value = nodes + + typedef unordered_map<uint32_t , unordered_set<uint32_t >> graph; /// adjacency list , key = node_id , value = set of neighbor ids + + typedef pair<uint32_t , uint32_t> edge; + + vector<edge> get_edges(graph &g,bool directed); + + + bool is_it_undirected(graph &g); + + + unordered_set<uint32_t > intersection(unordered_set<uint32_t> &s1,unordered_set<uint32_t> &s2); + + +} + + + +#endif //P_K_COMPRESSION_GRAPH_H diff --git a/(p,t) compression/hash.h b/(p,t) compression/hash.h new file mode 100644 index 0000000000000000000000000000000000000000..3394fd28f4de154062a068b2340960ea6a60f20c --- /dev/null +++ b/(p,t) compression/hash.h @@ -0,0 +1,43 @@ +// +// Created by Kiouche on 1/20/2020. +// + +#ifndef P_K_COMPRESSION_HASH_H +#define P_K_COMPRESSION_HASH_H + + + + +#include <string> +#include <vector> + +namespace std { + + +/* Combination hash from Boost */ + template <class T> + inline void hash_combine(size_t& seed, const T& v) + { + hash<T> hasher; + seed ^= hasher(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2); + } + + template<typename S, typename T> struct hash<pair<S, T>> +{ + inline size_t operator()(const pair<S, T>& v) const + { + size_t seed = 0; + hash_combine(seed, v.first); + hash_combine(seed, v.second); + return seed; + } + +}; +/* End combination hash from Boost */ + + + +} + + +#endif //P_K_COMPRESSION_HASH_H diff --git a/(p,t) compression/io.cpp b/(p,t) compression/io.cpp new file mode 100644 index 0000000000000000000000000000000000000000..375bc75de8fdf88d72330115b9549131a51607c5 --- /dev/null +++ b/(p,t) compression/io.cpp @@ -0,0 +1,78 @@ +// +// Created by Kiouche on 1/20/2020. 
+// + +#include <fstream> +#include <iostream> +#include <string> +#include <sstream> +#include <unistd.h> +#include <vector> +#include "graph.h" +#include "io.h" +#include "hash.h" + +namespace std { + + tuple<graph,unordered_map<edge , double>> read_graph_from_file(string filename,bool directed) { + + ifstream f(filename); + string line; + graph g; + unordered_map<edge , double> edges_scores; + int loops = 0; + + if (f.is_open()) { + while (getline(f, line)) { + uint32_t src_id, dst_id; + edge e,e_r; + double score; + stringstream ss; + ss.str(line); + ss >> src_id; + ss >> dst_id; + ss >> score; + + if (src_id!=dst_id) { + e.first = src_id ; e.second = dst_id; + e_r.first = dst_id; e_r.second = src_id; + g[src_id].insert(dst_id); + if(!directed) g[dst_id].insert(src_id); + edges_scores[make_pair(src_id,dst_id)] = score ; + // if(!directed) edges_scores[e_r] = score; + } + else loops++; + } + } else { + cout << "Unable to open " << filename << " ! \n"; + exit(-1); + } + cout << "number of nodes = " << g.size() << endl; + cout << "loops = " << loops << endl; + return {g,edges_scores}; + } + + void graph_to_file (po::variables_map &var,double runtime, double c_rate,vector<edge> &edges){ + + ofstream file; + file.open(var["output_file"].as<string>()); + file << "Original graph" << "\t"<< var["input"].as<string>() <<endl; + file << "Directed" << "\t" << to_string(var["directed"].as<bool>()) << endl; + file << "Depth" << "\t" << to_string(var["depth"].as<int>()) << endl; + vector<double> p_values = var["proportions"].as<vector<double>>(); + int depth = var["depth"].as<int>(); + for (int i = 0;i<=depth;i++){ + file << "k " + to_string(i) << "\t" << to_string(p_values.at(i)) << endl; + } + file << "execution time " << "\t" << to_string(runtime) << endl; + file << "compression rate " << "\t" << to_string(c_rate) << endl; + + file << endl; + + for (auto e : edges){ + file << e.first <<"\t"<< e.second << endl; + } + file.close(); + } + +} \ No newline at end of file diff 
--git a/(p,t) compression/io.h b/(p,t) compression/io.h new file mode 100644 index 0000000000000000000000000000000000000000..6e901a5c3cdae018c2c5f43aaa88abcd00918b09 --- /dev/null +++ b/(p,t) compression/io.h @@ -0,0 +1,19 @@ +// +// Created by Kiouche on 1/20/2020. +// + +#ifndef P_K_COMPRESSION_IO_H +#define P_K_COMPRESSION_IO_H +#include "graph.h" +#include <boost/program_options.hpp> + +namespace po = boost::program_options; + + +namespace std { + tuple<graph,unordered_map<edge , double>> read_graph_from_file(string filename,bool directed); + void graph_to_file (po::variables_map &var,double runtime, double c_rate,vector<edge> &edges); + +} + +#endif //P_K_COMPRESSION_IO_H diff --git a/(p,t) compression/main.cpp b/(p,t) compression/main.cpp new file mode 100644 index 0000000000000000000000000000000000000000..ce3bc9d20300afda0857d880878610a693c7c174 --- /dev/null +++ b/(p,t) compression/main.cpp @@ -0,0 +1,95 @@ +#include <iostream> +#include <chrono> +#include "p_k_compression.h" +#include <boost/program_options.hpp> +#include "graph.h" +#include "io.h" +#include <sstream> +#include <string> +#include <fstream> +#include "hash.h" + + +using namespace std; +namespace po = boost::program_options; + + + +int main(int argc, char *argv[]) { + + srand((unsigned)time(NULL)); + + po::options_description desc("Allowed options"); + desc.add_options() + ("help", "produce help message") + ("input", po::value<string>(), "graph file name") + ("directed", po::value<bool>()->default_value(true), "is the graph directed?") + ("algorithm",po::value<string>()->default_value("Random"), "The compression Algorithm") + ("depth",po::value<int>()->default_value(2),"the depth of the compression") + ("proportions",po::value<std::vector<double> >()->multitoken(),"the preserving proportions") + ("output_file",po::value<string>(), "the path of the compressed graph"); + + po::variables_map var; + po::store(po::parse_command_line(argc, argv, desc), var); + po::notify(var); + + 
if(var.count("help")) { + cout << desc << "\n"; + return 0; + } + + if(!var.count("input")) { + cout << "Missing argument --input.\n"; + return 1; + } + if(!var.count("output_file")) { + cout << "Missing argument --output_file.\n"; + return 1; + } + + graph g2; + + vector<int> s ; + vector<double> c ; + // perform 30 experiments + for (int i =0; i<30;i++) { + cout << i << endl; + string file_name = var["input"].as<string>() + to_string(i) + ".txt"; + double compression_rate = 0; + auto[g, e_s] = read_graph_from_file(file_name, var["directed"].as<bool>()); + auto start = chrono::steady_clock::now(); + + if (var["algorithm"].as<string>() == "Random") { + g2 = compress_graph_basic(g, var["depth"].as<int>(), var["proportions"].as<vector<double>>(), + var["directed"].as<bool>()); + + } else if (var["algorithm"].as<string>() == "LP") { + g2 = compress_graph_LP(g, e_s, var["depth"].as<int>(), var["proportions"].as<vector<double>>(), + var["directed"].as<bool>()); + + } else if (var["algorithm"].as<string>() == "SA") { + g2 = Simulated_annealing(1000, 10, 0.99, g, var["directed"].as<bool>(), var["depth"].as<int>(), + var["proportions"].as<vector<double>>()); + }else if (var["algorithm"].as<string>() == "Greedy") { + g2 = compress_graph_greedy(g, var["depth"].as<int>(), var["proportions"].as<vector<double>>(), + var["directed"].as<bool>()); + + } + + auto finish = chrono::steady_clock::now(); + vector<edge> edges_original = get_edges(g, var["directed"].as<bool>()); + vector<edge> edges_compressed = get_edges(g2, var["directed"].as<bool>()); + + double elapsed_time = chrono::duration_cast<chrono::duration<double>>(finish - start).count(); + compression_rate = ((double) (edges_original.size() - edges_compressed.size()) / + edges_original.size()); + c.push_back(compression_rate); + s.push_back(edges_compressed.size()); + //graph_to_file(var, elapsed_time, compression_rate, edges_compressed); + cout << "compression time " << elapsed_time << endl; + cout << "compression rate " 
<< compression_rate << endl; + } + for (int i = 0 ; i<30;i++) cout << c.at(i) << '\t' << s.at(i) << endl; + + return 0; +} diff --git a/(p,t) compression/p_k_compression.cpp b/(p,t) compression/p_k_compression.cpp new file mode 100644 index 0000000000000000000000000000000000000000..e068a8a93c11c66baca87a7d605c372cc7277efd --- /dev/null +++ b/(p,t) compression/p_k_compression.cpp @@ -0,0 +1,391 @@ +// +// Created by Kiouche on 1/20/2020. +// + +#include "p_k_compression.h" +#include <chrono> +#include <map> +#include <list> +#include <iostream> +#include <algorithm> +#include <random> +#include <queue> + +#include "hash.h" + + + +namespace std { + + + bool test_insert(edge &e, bool directed,graph &constructed_graph,graph &compressed_graph ,int k, vector<double> &p){ + uint32_t first, second; + first = e.first; + second = e.second; + if(!directed) { + if (constructed_graph[e.first].size() > constructed_graph[e.second].size()) { + first = e.second; + second = e.first; + } + } + bool insert = BFS(first,compressed_graph,constructed_graph,p,k); + if ( insert){ + return true; + } + else if (!directed) { + return BFS(second,compressed_graph,constructed_graph,p,k); + } + else return insert; + } + + void get_neighbors(uint32_t node, graph &g,unordered_map<int,unordered_set<uint32_t>> &neighbors, + int &maxDepth){ + + // Mark all the vertices as not visited + unordered_map<uint32_t , int> node_visited; + // Create a queue for BFS + deque<uint32_t> queue; + // array_queue *queue = new array_queue(); + // Mark the current node as visited and enqueue it + node_visited[node]= 1; + queue.push_back(node); + + int currentDepth = 1, + elementsToDepthIncrease = 1, + nextElementsToDepthIncrease = 0; + + while(!queue.empty()) { + // Dequeue a vertex from queue and print it + uint32_t s = queue.front(); + queue.pop_front(); + // node_visited[s] = 1; + for (auto &v : g[s]){ + if (node_visited[v]==0) { + nextElementsToDepthIncrease++; + node_visited[v]=1; + queue.push_back(v); + for (int i 
= currentDepth; i <= maxDepth; i++) neighbors[i].insert(v); + } + } + if (--elementsToDepthIncrease == 0) { + if (++currentDepth > maxDepth) break; + elementsToDepthIncrease = nextElementsToDepthIncrease; + nextElementsToDepthIncrease = 0; + } + } + } + + bool check_constraints(graph &original_graph,graph &compressed_graph,vector<double> p, int k){ + + for (auto v : original_graph){ + unordered_map<int,unordered_set<uint32_t>> n_v; + get_neighbors(v.first,compressed_graph,n_v,k); + for (int i=1;i<= k;i++){ + unordered_set<uint32_t> nghs = intersection( n_v[i],original_graph[v.first]); + if ((double) nghs.size() < v.second.size()*p.at(i)){ + cout << i << "---" << v.first << " " << nghs.size() << " " + << v.second.size() << endl; + return false; + } + } + } + return true; + } + + bool BFS(uint32_t node, graph &g,graph &constructed_graph,vector<double> &p,int &maxDepth){ + + unordered_map<int,unordered_set<uint32_t>> neighbors; + // Mark all the vertices as not visited + unordered_map<uint32_t , int> node_visited; + // Create a queue for BFS + deque<uint32_t> queue; + // Mark the current node as visited and enqueue it + node_visited[node]= 1; + queue.push_back(node); + + int currentDepth = 1, + elementsToDepthIncrease = 1, + nextElementsToDepthIncrease = 0; + + while(!queue.empty()) { + // Dequeue a vertex from queue and print it + uint32_t s = queue.front(); + queue.pop_front(); + for (auto &v : g[s]){ + if (node_visited[v]==0) { + nextElementsToDepthIncrease++; + node_visited[v]=1; + queue.push_back(v); + for (int i = currentDepth; i <= maxDepth; i++) neighbors[i].insert(v); + } + } + if (--elementsToDepthIncrease == 0) { + double nb_nghrs = constructed_graph[node].size(); + unordered_set<uint32_t> s_neighbors = intersection( neighbors[currentDepth],constructed_graph[node]); + if ((double) s_neighbors.size() < nb_nghrs * p.at(currentDepth)) return true; + if (++currentDepth > maxDepth) break; + elementsToDepthIncrease = nextElementsToDepthIncrease; + 
nextElementsToDepthIncrease = 0; + } + } + if (currentDepth <= maxDepth){ + for (int i= currentDepth;i<=maxDepth;i++){ + double nb_nghrs = constructed_graph[node].size(); + unordered_set<uint32_t> s_neighbors = intersection( neighbors[i],constructed_graph[node]); + if ((double) s_neighbors.size() < nb_nghrs * p.at(i)) return true; + } + } + return false; + } + + graph compress_graph_LP(graph &initial_graph, unordered_map<edge,double> edges_scores, int k, vector<double> p,bool directed){ + + double compression_rate; + vector<edge> inserted_edges; + + vector<pair<edge,double>> es_vec; + + for ( auto & e : edges_scores ) es_vec.push_back(make_pair(e.first,e.second)); + sort(es_vec.begin(), es_vec.end(), // sort by PL scores + [](const pair<edge,double> & l, const pair<edge,double>& r) { + return l.second > r.second; + }); + cout << es_vec.size() << endl; + graph compressed_graph,constructed_graph; + int number_edges = 0; + int i=0; + for (auto e_s : es_vec){ + edge e = e_s.first; + if (i%10000==0) cout << i<< endl; + i++; + constructed_graph[e.first].insert(e.second); + if(!directed) constructed_graph[e.second].insert(e.first); + bool insert =test_insert(e,directed,constructed_graph,compressed_graph,k,p); + + if ( insert) { + inserted_edges.push_back(e); + number_edges++; + compressed_graph[e.first].insert(e.second); + if (!directed) compressed_graph[e.second].insert(e.first); + } + } + cout << "number of edges " << number_edges << endl; + if (check_constraints(initial_graph,compressed_graph,p,k)) cout << "feasible compression!"<< endl; + return compressed_graph; + } + + + graph compress_graph_basic(graph &initial_graph, int k, vector<double> p,bool directed){ + + double compression_rate; + vector<edge> inserted_edges; + vector<edge> s = get_edges(initial_graph,directed); + std::mt19937 g(rand()); + std::shuffle(s.begin(), s.end(), g); + + graph compressed_graph,constructed_graph; + int number_edges = 0; + int i=0; + for (auto e : s){ + if (i%10000==0) cout << i<< 
endl; + i++; + constructed_graph[e.first].insert(e.second); + if(!directed) constructed_graph[e.second].insert(e.first); + bool insert =test_insert(e,directed,constructed_graph,compressed_graph,k,p); + if (insert){ + inserted_edges.push_back(e); + number_edges++; + compressed_graph[e.first].insert(e.second); + if(!directed) compressed_graph[e.second].insert(e.first); + } + } + cout << "number of edges " << number_edges << endl; + if (check_constraints(initial_graph,compressed_graph,p,k)) cout << "feasible compression!"<< endl; + return compressed_graph; + } + + + vector<edge> perturbate_solution(vector<edge> &s){ + vector<edge> s2; + for (auto e : s ) s2.push_back(e); + int max_perturbations = 2; //s.size()/100; + std::mt19937 gen(rand()); + std::uniform_int_distribution<> dis(0, s.size()-1); + for (int i=0;i<max_perturbations;i++){ + int r1,r2; + r1 = dis(gen); + r2 = dis(gen); + edge e = s2.at(r1); + s2.at(r1).first= s2.at(r2).first; + s2.at(r1).second= s2.at(r2).second; + s2.at(r2).first = e.first; + s2.at(r2).second = e.second; + } + return s2; + } + + + tuple<double,graph> evaluate_permutation ( vector<edge> &permutation,bool directed, vector<double> &p, + int k,graph &g){ + //double compression_rate; + graph compressed_graph,constructed_graph; + int inserted_edges = 0; + for (auto e : permutation){ + // update constructed graph + constructed_graph[e.first].insert(e.second); + if (!directed) constructed_graph[e.second].insert(e.first); + bool insert =test_insert(e,false,constructed_graph,compressed_graph,k,p); + if (insert){ + inserted_edges++; + compressed_graph[e.first].insert(e.second); + if (!directed) compressed_graph[e.second].insert(e.first); + } + } + + return make_tuple(inserted_edges,compressed_graph); + } + + + + graph Simulated_annealing ( int max_iterations, + double initial_temperature, + double decrease_factor, + graph &initial_graph, + bool directed, + int k,vector<double> p){ + auto start = chrono::steady_clock::now(); + graph gr; + 
vector<edge> best_permutation; + /// generate initial solution + vector<edge> s = get_edges(initial_graph,false); + best_permutation = s; + std::mt19937 g(rand()); + std::shuffle(s.begin(), s.end(), g); + + double cost_s,cost_s2,T,best; + T= initial_temperature; + cost_s = get<0> (evaluate_permutation(s,directed,p,k,initial_graph)); + gr = get<1> (evaluate_permutation(s,directed,p,k,initial_graph)); + best = cost_s; + for (int i = 0;i<max_iterations;i++){ + vector<edge> s2 = perturbate_solution(s); + cost_s2 = get<0> (evaluate_permutation(s2,directed, p,k,initial_graph)); + if (cost_s2 < best){ + best_permutation = s2; + gr = get<1> (evaluate_permutation(s2,directed,p,k,initial_graph)); + best = cost_s2; + auto finish = chrono::steady_clock::now(); + double elapsed_time= chrono::duration_cast<chrono::duration<double>>(finish - start).count(); + //cout <<i<< '\t'<<best<< '\t' << elapsed_time << endl; + } + if (cost_s2 < cost_s){ + s = s2; + cost_s = cost_s2; + } + else { + double r = ((double) rand() /(RAND_MAX)); + if (std::exp( (cost_s -cost_s2) / T) > r ){ + s = s2; + cost_s = cost_s2; + } + } + T = decrease_factor*T; + } + + if ( check_constraints(initial_graph,gr,p,k)) { + cout << "feasible solution ! 
" << endl; + } + + return gr; + } + + + + + graph compress_graph_greedy(graph &initial_graph, int k, vector<double> p,bool directed){ + + double compression_rate; + vector<edge> inserted_edges; + vector<edge> s = greedy_edges_order(initial_graph,directed,k); + + + graph compressed_graph,constructed_graph; + int number_edges = 0; + int i=0; + for (auto e : s){ + if (i%10000==0) cout << i<< endl; + i++; + constructed_graph[e.first].insert(e.second); + if(!directed) constructed_graph[e.second].insert(e.first); + bool insert =test_insert(e,directed,constructed_graph,compressed_graph,k,p); + if (insert){ + inserted_edges.push_back(e); + number_edges++; + compressed_graph[e.first].insert(e.second); + if(!directed) compressed_graph[e.second].insert(e.first); + } + } + cout << "number of edges " << number_edges << endl; + if (check_constraints(initial_graph,compressed_graph,p,k)) cout << "feasible compression!"<< endl; + return compressed_graph; + } + + vector<edge> greedy_edges_order(graph &g,bool directed,int k){ + vector<edge> edges = get_edges(g,directed); + map<pair<uint32_t ,uint32_t >,int> edge_score; + for (auto e : edges ){ + unordered_set<uint32_t > neighbors_u; + unordered_set<uint32_t> neighbors_v; + neighbors_u = neighbors(g,k-1,e.first,e.second); + neighbors_v = neighbors(g,k-1,e.second,e.first); + for (auto n : neighbors_u){ + if (g[n].find(e.second)!=g[n].end()) edge_score[e]=edge_score[e]+1; + } + for (auto n : neighbors_v){ + if (g[n].find(e.first)!=g[n].end()) edge_score[e]=edge_score[e]+1; + } + } + sort(edges.begin(),edges.end(),[&]( const edge &e1, const edge &e2 ) + { return edge_score[e1] > edge_score[e2];} ); + return edges; + } + + + unordered_set<uint32_t> neighbors(graph &g,int max_depth,uint32_t u,uint32_t v){ + unordered_set<uint32_t> nghrs; + // Mark all the vertices as not visited + unordered_map<uint32_t , int> node_visited; + // Create a queue for BFS + deque<uint32_t> queue; + // array_queue *queue = new array_queue(); + // Mark the 
current node as visited and enqueue it + node_visited[u]= 1; + node_visited[v] = 1; + queue.push_back(u); + int currentDepth = 1, + elementsToDepthIncrease = 1, + nextElementsToDepthIncrease = 0; + + + while(!queue.empty()) { + // Dequeue the next vertex from the queue + uint32_t s = queue.front(); + queue.pop_front(); + // node_visited[s] = 1; + for (auto &w : g[s]){ + if (node_visited[w]==0) { + node_visited[w]=1; + queue.push_back(w); ++nextElementsToDepthIncrease; // count vertices queued for the next level + nghrs.insert(w); + } + } + if (--elementsToDepthIncrease == 0) { + if (++currentDepth > max_depth) break; + elementsToDepthIncrease = nextElementsToDepthIncrease; + nextElementsToDepthIncrease = 0; + } + } + return nghrs; + } + +} \ No newline at end of file diff --git a/(p,t) compression/p_k_compression.h b/(p,t) compression/p_k_compression.h new file mode 100644 index 0000000000000000000000000000000000000000..5f96fdeff2c09c30e961a0106b29887327019ae9 --- /dev/null +++ b/(p,t) compression/p_k_compression.h @@ -0,0 +1,57 @@ +// +// Created by Kiouche on 1/20/2020. 
+// + +#ifndef P_K_COMPRESSION_P_K_COMPRESSION_H +#define P_K_COMPRESSION_P_K_COMPRESSION_H +#include "graph.h" + +namespace std { + + + bool check_constraints(graph &original_graph,graph &compressed_graph,vector<double> p, int k); + + + bool BFS(uint32_t node, graph &g,graph &constructed_graph,vector<double> &p,int &maxDepth); + + bool test_insert(edge &e, bool directed,graph &constructed_graph,graph &compressed_graph ,int k, vector<double> &p); + + graph compress_graph_LP(graph &initial_graph, unordered_map<edge,double> edges_scores, int k, vector<double> p,bool directed); + + void get_neighbors(uint32_t node, graph &g,unordered_map<int,unordered_set<uint32_t>> &neighbors, + int &maxDepth); + graph Greedy_compression(graph &initial_graph,unordered_map<edge,double> edges_scores, int k, vector<double> p,bool directed); + graph compress_graph_basic(graph &initial_graph, int k, vector<double> p,bool directed); + + graph random_compression_graph(graph &initial_graph,bool directed); + + vector<edge> perturbate_solution(vector<edge> &s); + + graph Simulated_annealing ( int max_iterations, + double initial_temperature, + double decrease_factor, + graph &initial_graph, + bool directed, + int k,vector<double> p); + + + + + + + + tuple<double,graph> evaluate_permutation ( vector<edge> &permutation,bool directed, vector<double> &p, + int k,graph &g); + + + graph compress_graph_greedy(graph &initial_graph, int k, vector<double> p,bool directed); + + + + unordered_set<uint32_t> neighbors(graph &g,int max_depth,uint32_t u,uint32_t v); + vector<edge> greedy_edges_order(graph &g,bool directed,int k); + + +} + +#endif //P_K_COMPRESSION_P_K_COMPRESSION_H diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..6b468505443f1f339f919b1ecf7718480f689fca --- /dev/null +++ b/LICENSE @@ -0,0 +1,201 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + 
+ 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2021 Abd Errahmane Kiouche + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License.